the-problem-with-using-a-download-link

 

Download Links

 

There is a common misconception that HTTPS is protecting files on websites. As HTTPS is only encrypting communication and makes sure the server name (FQDN) is the same as covered by the given SSL certificate, it can only protect against MIM (Man-in-the-Middle) attacks. Such being the case, how do users confidently ensure the integrity of files when they use a download link? Read on to learn what the current established best practices are for ensuring file integrity when downloading as well as the potential dangers that exist around them.

 

What Happens if you Click a Download Link on a Website?

You typically download a file using a link that points directly to a download location, which is the most transparent case. Alternatively, sites will use a script to create an individual download link. The used browser is aware of a download request and starts using a built in mechanism i.e. a download manager to ensure the integrity of the data stream. Alternatively, it will rely on a lower-level protocol like TCP to provide this.

 

One can assume this is a reliable precaution against accidental corruption of files in transfer.

 

Doing this means you can assume the download is only marked complete when the file integrity check has been successful.

 

Of course, if you prefer, you can install an additional download manager to take care of the download action. Some common download managers include Free Download Manager and Persepolis.

 

However, in either case, using a download manager doesn’t mean the file you downloaded was actually the file you intended to download.

 

Ways to Corrupt a File Intentionally

When it comes to intentional corruption of files by a Man-in-the-Middle, any attacker who can modify the data-stream of the download can very likely also modify the data-stream which is used to transfer the checksum. As such, checksum verification is not an efficient security precaution. You might say, “But that’s covered by TLS and HTTPS communication.” Yes, that is true, but only if both the website and the download location are protected by HTTPS.

 

If someone gains access to the website where the download link as well as checksum are part of the same page or website, there is no protection left. The attacker can simply replace both the download link and checksum.

 

Therefore, the mentioned checksum (MD5, SHA1, and SHA256 are typically used) is only of value when the file advertising website and the file hosting website are under the control of different entities. A good example of this is when you outsource your downloadable files to a 3rd party file hoster.

 

The Truth about Checksums

A cryptographic hash (sometimes called a ‘digest’) is a kind of ‘signature’ or ‘fingerprint’ for a text or a data file.

 

A hash is not ‘encryption’. It cannot be decrypted back to the original text. It is a ‘one-way’ cryptographic function that is a fixed size for any size of source text. Thus, hashes are suitable for password validation, challenge authentication, anti-tampering, and digital signatures.

 

Let’s get into the different checksum algorithms first:

 

MD5: It’s the most popular checksum algorithm but it is completely broken from a security perspective. The probability of an accidental collision is still vanishingly small, but exists.

 

SHA1: Secure Hash Algorithm generates an almost-unique 160-bit (20-byte) signature for any given text. However, SHA-1 is no longer recommended for cryptographic purposes. Google has now achieved a collision attack on SHA-1. While it might still be safe to use SHA-1 most of the time, the message is clear – switch to SHA-256 or SHA-3. There is a good description on Wikipedia if you’d like to learn more.

 

SHA-256 or SHA-3: SHA-256 algorithm generates an almost-unique, fixed size 256-bit (32-byte) hash. SHA-256 is one of the successor hash functions to SHA-1, and is one of the strongest hash functions available.

 

Truth is, while the checksum algorithm used might be pretty safe, the checksum calculation and comparison process is mostly manual. Even worse, it’s not easy to understand!

 

The manual nature of the process also means that a website owner has to provide checksums to his users as a type of visual safety net. Unfortunately, now it’s up to the downloading user to create a checksum of the downloaded file on his end in order to compare it with the website listed checksum.

 

That means it’s manual, error-prone, and optional AKA unsafe!

 

Final Thoughts

As you can see, it’s easy to understand why many users might think their downloads are safe despite how illusory and thinly veiled that safety can actually be. In fact, clicking a download link may deliver an altered version and not the original file that was intended. The best practice of verifying the checksum of your downloaded copy and matching it to the publisher’s listed checksum is an unsafe manual and error-prone process.

 

In our next blog in this series, we will explore ways to use checksums and how to verify their correctness.

 

In the meantime, if you liked this blog and think code-signing and digital certificates are a good way to verify the integrity of a downloaded file, you’ll want to think again. Check out our blog ‘Avoid the Digital and Code Signing Certificates’ Journey of Pain’.

 

 

Please, Take Me to the Blog

CNIL
Metrics and Logs

(formerly, Opvizor Performance Analyzer)

VMware vSphere & Cloud
PERFORMANCE MONITORING, LOG ANALYSIS, LICENSE COMPLIANCE!

Monitor and Analyze Performance and Log files:
Performance monitoring for your systems and applications with log analysis (tamperproof using immudb) and license compliance (RedHat, Oracle, SAP and more) in one virtual appliance!

Subscribe to Our Newsletter

Get the latest product updates, company news, and special offers delivered right to your inbox.

Subscribe to our newsletter

Use Case - Tamper-resistant Clinical Trials

Goal:

Blockchain PoCs were unsuccessful due to complexity and lack of developers.

Still the goal of data immutability as well as client verification is a crucial. Furthermore, the system needs to be easy to use and operate (allowing backup, maintenance windows aso.).

Implementation:

immudb is running in different datacenters across the globe. All clinical trial information is stored in immudb either as transactions or the pdf documents as a whole.

Having that single source of truth with versioned, timestamped, and cryptographically verifiable records, enables a whole new way of transparency and trust.

Use Case - Finance

Goal:

Store the source data, the decision and the rule base for financial support from governments timestamped, verifiable.

A very important functionality is the ability to compare the historic decision (based on the past rulebase) with the rulebase at a different date. Fully cryptographic verifiable Time Travel queries are required to be able to achieve that comparison.

Implementation:

While the source data, rulebase and the documented decision are stored in verifiable Blobs in immudb, the transaction is stored using the relational layer of immudb.

That allows the use of immudb’s time travel capabilities to retrieve verified historic data and recalculate with the most recent rulebase.

Use Case - eCommerce and NFT marketplace

Goal:

No matter if it’s an eCommerce platform or NFT marketplace, the goals are similar:

  • High amount of transactions (potentially millions a second)
  • Ability to read and write multiple records within one transaction
  • prevent overwrite or updates on transactions
  • comply with regulations (PCI, GDPR, …)


Implementation:

immudb is typically scaled out using Hyperscaler (i. e. AWS, Google Cloud, Microsoft Azure) distributed across the Globe. Auditors are also distributed to track the verification proof over time. Additionally, the shop or marketplace applications store immudb cryptographic state information. That high level of integrity and tamper-evidence while maintaining a very high transaction speed is key for companies to chose immudb.

Use Case - IoT Sensor Data

Goal:

IoT sensor data received by devices collecting environment data needs to be stored locally in a cryptographically verifiable manner until the data is transferred to a central datacenter. The data integrity needs to be verifiable at any given point in time and while in transit.

Implementation:

immudb runs embedded on the IoT device itself and is consistently audited by external probes. The data transfer to audit is minimal and works even with minimum bandwidth and unreliable connections.

Whenever the IoT devices are connected to a high bandwidth, the data transfer happens to a data center (large immudb deployment) and the source and destination date integrity is fully verified.

Use Case - DevOps Evidence

Goal:

CI/CD and application build logs need to be stored auditable and tamper-evident.
A very high Performance is required as the system should not slow down any build process.
Scalability is key as billions of artifacts are expected within the next years.
Next to a possibility of integrity validation, data needs to be retrievable by pipeline job id or digital asset checksum.

Implementation:

As part of the CI/CD audit functionality, data is stored within immudb using the Key/Value functionality. Key is either the CI/CD job id (i. e. Jenkins or GitLab) or the checksum of the resulting build or container image.

White Paper — Registration

We will also send you the research paper
via email.

CodeNotary — Webinar

White Paper — Registration

Please let us know where we can send the whitepaper on CodeNotary Trusted Software Supply Chain. 

Become a partner

Start Your Trial

Please enter contact information to receive an email with the virtual appliance download instructions.

Start Free Trial

Please enter contact information to receive an email with the free trial details.