DevToolKit

TAR Archive & Extract

Create and extract TAR archives entirely in your browser. Supports .tar and .tar.gz/.tgz with file explorer view, individual downloads, and batch ZIP export. No uploads, no server processing.

.tar · .tar.gz · .tgz

Drop a .tar, .tar.gz, or .tgz file

Auto-detects GZIP compression by magic bytes

Client-Side TAR Processing

This tool parses and creates TAR archives entirely in your browser using a custom 512-byte block parser. For .tar.gz and .tgz files, the native DecompressionStream API handles GZIP decompression before TAR parsing. No files are uploaded anywhere.


How to Use

Create TAR archives from multiple files or extract the contents of existing .tar, .tar.gz, and .tgz archives. Everything runs in your browser with no server interaction.

Extracting a TAR Archive

  1. Select Unpack mode using the segmented control at the top. This is the default mode when the tool loads.
  2. Drop or select your archive file. Accepted formats include .tar, .tar.gz, and .tgz. The tool auto-detects GZIP compression by checking for magic bytes 0x1f 0x8b at the start of the file.
  3. Browse the file explorer. After parsing, the tool displays every file and directory in the archive with its name, size, type, and last modification date. The file explorer supports scrolling for archives with hundreds of entries.
  4. Download individual files by clicking the download icon next to any file entry, or use Extract All as ZIP to download the entire archive contents as a single ZIP file.
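The auto-detection in step 2 amounts to a two-byte check at the start of the file. A minimal sketch (the helper name is illustrative, not the tool's actual code):

```javascript
// Detect GZIP by its magic bytes: 0x1f 0x8b at offset 0.
function isGzip(bytes) {
  return bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b;
}

// In the browser this would be fed from the dropped File, e.g.:
//   const head = new Uint8Array(await file.slice(0, 2).arrayBuffer());
//   const compressed = isGzip(head);
```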

Creating a TAR Archive

  1. Switch to Pack mode using the segmented control.
  2. Add files by dropping them onto the dropzone or clicking to browse. You can add files in batches — each new drop appends to the existing file list.
  3. Review the file list. The tool displays all queued files with their names and sizes. Remove any file by clicking the × button next to it.
  4. Click Create TAR Archive. The tool reads each file into memory, constructs 512-byte ustar headers with correct checksums, pads file data to block boundaries, and appends the end-of-archive marker.
  5. Download the result. The generated .tar file is ready for download. You can compress it further using the GZIP tool to create a .tar.gz.
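The sizing rules behind step 4 can be sketched in a few lines, assuming the block layout described below: each entry is a 512-byte header plus data padded to a 512-byte boundary, and the archive ends with two zero-filled blocks (function names are illustrative):

```javascript
const BLOCK = 512;

// Round a file's data length up to the next 512-byte boundary.
function paddedSize(n) {
  return Math.ceil(n / BLOCK) * BLOCK;
}

// Total archive size: one header block plus padded data per file,
// plus the two zero-filled end-of-archive blocks.
function archiveSize(fileSizes) {
  const entries = fileSizes.reduce((sum, n) => sum + BLOCK + paddedSize(n), 0);
  return entries + 2 * BLOCK;
}
```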

About This Tool

The TAR Format: From Tape to Cloud

TAR stands for Tape Archive, a format originally designed in 1979 for writing sequential backups to magnetic tape drives on Unix systems. Despite its age, TAR remains the dominant archiving format in Linux, macOS, and the open-source ecosystem. Source tarballs on GitHub, Docker image layers, and Node.js packages published to npm all use TAR as their packaging format.

The 512-Byte Block Architecture

TAR's design reflects its tape heritage. Every piece of data is aligned to 512-byte blocks — the natural sector size of magnetic tape. Each file in the archive begins with a 512-byte header block that contains the filename (100 bytes, null-terminated), file permissions in octal ASCII (8 bytes), owner and group IDs (8 bytes each), the file size in octal (12 bytes), a Unix timestamp (12 bytes), a checksum (8 bytes), and a type flag (1 byte). After the header, the file's raw data follows, padded with zero bytes to the next 512-byte boundary. The archive ends with two consecutive zero-filled blocks as an end-of-archive marker.
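Decoding a header block from this layout is mostly a matter of reading null-terminated ASCII and parsing octal. A sketch of two representative fields, with offsets derived from the field order above (name at byte 0, size at byte 124):

```javascript
// Parse an octal ASCII field (e.g. the 12-byte size field at offset 124).
function parseOctal(bytes, offset, length) {
  const text = new TextDecoder().decode(bytes.subarray(offset, offset + length));
  return parseInt(text.replace(/\0.*$/, "").trim() || "0", 8);
}

// Parse the null-terminated filename (first 100 bytes of the header).
function parseName(block) {
  return new TextDecoder().decode(block.subarray(0, 100)).replace(/\0.*$/, "");
}
```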

The ustar Standard (POSIX.1-1988)

The original TAR format limited filenames to 100 characters and had no standardized magic number. The ustar (Unix Standard TAR) extension, codified in POSIX.1-1988, added a 6-byte magic field at byte 257 (ustar) and a 155-byte prefix field at byte 345. By combining the prefix and name fields, ustar supports filenames up to 256 characters. This tool implements ustar for both reading and writing, ensuring compatibility with archives produced by GNU tar, BSD tar, and other modern implementations.

.tar.gz vs .tar: Archive Then Compress

TAR itself applies no compression — it simply concatenates files sequentially. Compression is handled by a separate tool, most commonly GZIP. The convention .tar.gz (or .tgz) indicates a TAR archive compressed with GZIP. This two-stage approach often compresses better than formats like ZIP, which compress each file individually: because GZIP sees the entire concatenated stream, it can exploit redundancy across files — common headers in log files, shared boilerplate in source code. The trade-off is that you cannot extract a single file without decompressing everything before it in the stream, but for whole-archive distribution the improved compression usually outweighs this limitation.

The TAR Checksum Algorithm

Each TAR header includes an 8-byte checksum field at bytes 148-155. The checksum is computed by summing all 512 bytes of the header, with the checksum field itself treated as eight space characters (0x20). This produces an unsigned integer stored as a 6-digit octal string followed by a null byte and a space. The checksum provides basic integrity verification — it detects single-byte corruption but is not a cryptographic hash. For stronger integrity guarantees, pair TAR with a file checksum tool.
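The checksum rule above translates directly to code: sum all 512 header bytes, substituting ASCII spaces for the checksum field itself, then format the result as octal. A minimal sketch:

```javascript
// Sum all 512 header bytes, treating the checksum field
// (bytes 148-155) as eight ASCII spaces (0x20).
function headerChecksum(block) {
  let sum = 0;
  for (let i = 0; i < 512; i++) {
    sum += (i >= 148 && i < 156) ? 0x20 : block[i];
  }
  return sum;
}

// Encode as stored in the header: 6-digit octal, a NUL, and a space.
function formatChecksum(sum) {
  return sum.toString(8).padStart(6, "0") + "\0 ";
}
```

Note that even an all-zero header has a nonzero checksum of 256 (eight spaces at 0x20 each), which is one way parsers distinguish real headers from the zero-filled end-of-archive blocks.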

Browser-Native Implementation

This tool implements a custom TAR parser in pure JavaScript — no external libraries, no WebAssembly, no server-side processing. The parser reads the raw bytes of the archive using the File.arrayBuffer() API, walks through 512-byte blocks, validates checksums, decodes octal ASCII fields, and extracts file data into Blob objects. For .tar.gz archives, the browser's native DecompressionStream('gzip') handles decompression before the TAR parser processes the uncompressed stream.
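The decompression stage described above can be sketched with standard Web APIs (DecompressionStream, Blob, and Response are available in browsers and in modern Node.js); this is a simplified stand-in for the tool's pipeline, not its actual code:

```javascript
// Decompress GZIP bytes into the raw TAR stream for the block parser.
async function gunzip(bytes) {
  const stream = new Blob([bytes]).stream()
    .pipeThrough(new DecompressionStream("gzip"));
  const buf = await new Response(stream).arrayBuffer();
  return new Uint8Array(buf);
}
```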

Why Use This Tool

When to Use TAR Archives

TAR is the standard archiving format across the Unix, Linux, and open-source ecosystem. Here are the most common scenarios where you will encounter or create TAR archives:

  • Source code distribution — GitHub release tarballs, npm packages, Python sdist packages, and Ruby gems all use .tar.gz as their primary distribution format. When you download a release from GitHub, the "Source code (tar.gz)" link provides a TAR archive.
  • Docker and container images — Every Docker image layer is stored as a TAR archive. When you run docker save, the output is a TAR file containing all image layers. Understanding TAR helps when debugging container builds or inspecting image contents.
  • System backups and migrations — System administrators use tar to create full directory backups that preserve file permissions, ownership, and timestamps. A typical backup command like tar czf backup.tar.gz /home creates a compressed archive of the entire home directory.
  • Log file bundling — Collecting log files from multiple servers or services into a single .tar.gz for analysis or archival. The cross-file compression of GZIP over TAR is especially effective with log files because they share common patterns.
  • Data science and ML datasets — Large datasets (ImageNet, Common Crawl, LAION) are distributed as TAR archives because the format handles millions of files without the directory-entry overhead of ZIP. Streaming TAR extraction is also more memory-efficient than ZIP for large archives.
  • Cross-platform file transfer — When transferring files between macOS, Linux, and Windows systems, TAR preserves Unix file permissions and symbolic links that ZIP may not handle correctly. This matters for development environments and configuration files.

TAR vs ZIP: When to Choose Each

Use TAR + GZIP when you need the best compression ratio, when working in Unix/Linux environments, or when preserving file metadata matters. Use ZIP when sharing with Windows users who may not have tar utilities, when you need to extract individual files without decompressing the entire archive, or when the recipient expects a self-contained compressed archive. Both formats are lossless and widely supported.

Privacy and Security

Your archives are processed entirely within your browser. No data is transmitted to any server — the TAR parser runs as client-side JavaScript, and GZIP decompression uses the browser's native DecompressionStream API. This makes the tool safe for working with proprietary source code, configuration files containing secrets, database exports, or any other sensitive content. You can verify this by opening your browser's Network tab — no requests are made during processing.

FAQ

What TAR formats does this tool support?
This tool handles standard .tar archives using the ustar (POSIX.1-1988) format, plus compressed .tar.gz and .tgz archives. For .tar.gz files, the browser's native DecompressionStream API decompresses the GZIP layer first, then the custom TAR parser reads the 512-byte block-aligned headers and extracts every file and directory entry.
How does TAR differ from ZIP?
TAR (Tape Archive) is a pure archiving format that concatenates files end-to-end with 512-byte headers but applies no compression on its own. ZIP both archives and compresses, storing each file with individual DEFLATE compression. TAR is typically paired with GZIP (.tar.gz) or other compressors for a two-stage archive-then-compress workflow, which often achieves better ratios than ZIP because the compressor sees the entire stream rather than individual files.
Can I extract individual files from a TAR archive?
Yes. After parsing, the tool displays a file explorer tree showing every entry with its name, size, type, and last modified date. You can download any individual file directly, or use the Extract All button to download the entire archive contents as a ZIP file for convenience.
Are my files uploaded to a server?
No. All TAR parsing, packing, and GZIP decompression happens entirely in your browser using JavaScript and the native Web Streams API. Your files never leave your machine. This makes the tool safe for archives containing source code, configuration files, database dumps, or any sensitive data.
What is the 512-byte block structure in TAR?
TAR organizes data into fixed 512-byte blocks. Each file entry begins with a 512-byte header containing the filename (100 bytes), file mode, owner/group IDs, file size in octal, modification timestamp, a checksum, and a type flag. The file data follows immediately after the header, padded to the next 512-byte boundary. The archive ends with two consecutive zero-filled blocks.