Shell command to tar directory excluding certain files/folders

deepwell
1#
deepwell Published in 2009-06-11 22:57:31Z
Is there a simple shell command/script that supports excluding certain files/folders from being archived?

I have a directory that needs to be archived, with a subdirectory that has a number of very large files I do not need to back up.

Not quite solutions:

The tar --exclude=PATTERN command matches the given pattern and excludes those files, but I need specific files and folders to be ignored (full file path); otherwise valid files might be excluded.

I could also use the find command to create a list of files, exclude the ones I don't want to archive, and pass the list to tar, but that only works for a small number of files. I have tens of thousands.

I'm beginning to think the only solution is to create a file with a list of files/folders to be excluded, then use rsync with --exclude-from=file to copy all the files to a tmp directory, and then use tar to archive that directory.

Can anybody think of a better/more efficient solution?

EDIT: cma's solution works well. The big gotcha is that the --exclude='./folder' MUST be at the beginning of the tar command. Full command (cd first, so backup is relative to that directory):

    cd /folder_to_backup
    tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .
ericosg
2#
You can have multiple exclude options for tar, so

    $ tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

etc. will work. Make sure to put --exclude before the source and destination items.
Alex B
3#
Alex B Reply to 2009-06-11 23:03:57Z
Use the find command in conjunction with the tar append (-r) option. This way you can add files to an existing tar in a single step, instead of a two-pass solution (create list of files, create tar).

    find /dir/dir -prune ... -o etc etc.... -exec tar rvf ~/tarfile.tar {} \;

Joe
4#
Joe Reply to 2009-06-11 23:04:12Z
Your best bet is to use find with tar, via xargs (to handle the large number of arguments). For example:

    find / -print0 | xargs -0 tar cjf tarfile.tar.bz2

fedorqui
5#
fedorqui Reply to 2014-12-11 12:13:53Z
You can exclude directories with --exclude for tar. If you want to archive everything except /usr you can use:

    tar -zcvf /all.tgz / --exclude=/usr

In your case perhaps something like:

    tar -zcvf archive.tgz arc_dir --exclude=dir/ignore_this_dir

camh
6#
camh Reply to 2009-06-12 05:53:17Z
You can use cpio(1) to create tar files. cpio takes the files to archive on stdin, so if you've already figured out the find command you want to use to select the files to archive, pipe it into cpio to create the tar file:

    find ... | cpio -o -H ustar | gzip -c > archive.tar.gz

Rob
7#
Rob Reply to 2010-02-05 21:59:16Z
I found this somewhere else so I won't take credit, but it worked better than any of the solutions above for my Mac-specific issues (even though this is closed):

    tar zc --exclude __MACOSX --exclude .DS_Store -f

carlo
8#
carlo Reply to 2012-03-04 15:18:30Z
To avoid possible 'xargs: Argument list too long' errors due to the use of find ... | xargs ... when processing tens of thousands of files, you can pipe the output of find directly to tar using find ... -print0 | tar --null ....
    # archive a given directory, but exclude various files & directories
    # specified by their full file paths
    find "$(pwd -P)" -type d \( -path '/path/to/dir1' -or -path '/path/to/dir2' \) -prune \
       -or -not \( -path '/path/to/file1' -or -path '/path/to/file2' \) -print0 |
       gnutar --null --no-recursion -czf archive.tar.gz --files-from -
       #bsdtar --null -n -czf archive.tar.gz -T -
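The find-to-tar pipeline above can be tried on a throwaway tree before pointing it at real data. This is only a sketch: the directory and file names (src, big, skip.log) are made up for illustration, and it assumes a tar that accepts --null, --no-recursion and -T - (GNU tar and bsdtar both document these).

```shell
# Hypothetical layout, built in a temporary directory for illustration only.
work=$(mktemp -d)
mkdir -p "$work/src/keep" "$work/src/big"
echo a > "$work/src/keep/a.txt"
echo b > "$work/src/big/huge.bin"
echo c > "$work/src/skip.log"

# Prune one directory and skip one file by exact path, then stream the
# NUL-separated names to tar. --no-recursion stops tar from re-adding the
# contents of directories that find has already walked.
(
  cd "$work/src" &&
  find . -path ./big -prune -o ! -path ./skip.log -print0 |
    tar --null --no-recursion -czf "$work/out.tar.gz" -T -
)

# Keep the listing in a variable so it can be inspected after cleanup.
listing=$(tar -tzf "$work/out.tar.gz" | sort)
printf '%s\n' "$listing"
rm -rf "$work"
```

Because the names never pass through a command line, the argument-length limit that affects xargs does not apply here.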
frommelmak
9#
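You can also use one of the "--exclude-tag" options, depending on your needs:

    --exclude-tag=FILE
    --exclude-tag-all=FILE
    --exclude-tag-under=FILE

The folder hosting the specified FILE will be excluded; the three variants differ in whether the folder entry and the tag file themselves are kept in the archive.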
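To see how the three tag options differ, the sketch below archives the same small tree three times and lists each result. It assumes GNU tar (the --exclude-tag family is a GNU extension); the tree and the tag file name exclude.tag are hypothetical.

```shell
# Hypothetical tree: cache/ carries a tag file, docs/ does not.
work=$(mktemp -d)
mkdir -p "$work/src/cache" "$work/src/docs"
touch "$work/src/cache/exclude.tag" "$work/src/cache/blob" "$work/src/docs/readme"

# List what a given variant would archive. tar's notice about the skipped
# directory goes to stderr and is silenced here.
list_with() {
  tar -cf - "$1=exclude.tag" -C "$work/src" . 2>/dev/null | tar -tf - | sort
}

tag_plain=$(list_with --exclude-tag)        # keeps cache/ and the tag file
tag_under=$(list_with --exclude-tag-under)  # keeps only the empty cache/ entry
tag_all=$(list_with --exclude-tag-all)      # drops cache/ completely

printf 'plain:\n%s\nunder:\n%s\nall:\n%s\n' "$tag_plain" "$tag_under" "$tag_all"
rm -rf "$work"
```

Which variant fits depends on whether the restored tree should still contain the (empty) directory and its marker file.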
Stephen Donecker
10#
Stephen Donecker Reply to 2012-11-08 00:22:34Z
Possible options to exclude files/directories from backup using tar:

Exclude files using multiple patterns:

    tar -czf backup.tar.gz --exclude=PATTERN1 --exclude=PATTERN2 ... /path/to/backup

Exclude files using an exclude file filled with a list of patterns:

    tar -czf backup.tar.gz -X /path/to/exclude.txt /path/to/backup

Exclude files using tags, by placing a tag file in any directory that should be skipped:

    tar -czf backup.tar.gz --exclude-tag-all=exclude.tag /path/to/backup
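As a small illustration of the exclude-file option, the sketch below writes two patterns to a file and passes it with -X. The tree, the file name exclude.txt and the patterns are made up for the example; it assumes a tar that supports -X and -C, such as GNU tar.

```shell
# Hypothetical tree with a log directory and an editor backup file.
work=$(mktemp -d)
mkdir -p "$work/src/logs" "$work/src/app"
touch "$work/src/logs/x.log" "$work/src/app/main.c" "$work/src/app/main.c~"

# One pattern per line: a specific directory, and any name ending in ~.
printf '%s\n' './logs' '*~' > "$work/exclude.txt"

# Archive to stdout and list it straight away, to check what was kept.
x_listing=$(tar -czf - -X "$work/exclude.txt" -C "$work/src" . | tar -tzf - | sort)
printf '%s\n' "$x_listing"
rm -rf "$work"
```

An exclude file scales better than a long row of --exclude flags and can be kept under version control next to the backup script.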
Georgios
11#
Possibly redundant answer, but since I found it useful, here it is: as root on FreeBSD (i.e. using csh) I wanted to copy my whole root filesystem to /mnt, but without /usr and (obviously) /mnt. This is what worked (I am at /):

    tar --exclude ./usr --exclude ./mnt --create --file - . | (cd /mnt && tar xvf -)

My whole point is that it was necessary (by putting the ./) to tell tar that the excluded directories were part of the greater directory being copied. My €0.02
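A scaled-down version of this tar-to-tar copy can be run safely in temporary directories. The sketch assumes the two tar commands are joined by a pipe (writer on the left, reader on the right); the src and dst trees are hypothetical stand-ins for / and /mnt.

```shell
# Stand-ins for the real source and destination trees.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/etc" "$src/usr" "$src/mnt"
touch "$src/etc/rc.conf" "$src/usr/big" "$src/mnt/other"

# The writer runs inside the source tree so member names start with ./,
# which is what the ./usr and ./mnt exclude patterns match against.
(cd "$src" && tar --exclude ./usr --exclude ./mnt --create --file - .) |
  (cd "$dst" && tar -xf -)

# Record what arrived in the destination.
copied=$(cd "$dst" && find . | sort)
printf '%s\n' "$copied"
rm -rf "$src" "$dst"
```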
Community
12#
I had no luck getting tar to exclude a 5 gigabyte subdirectory a few levels deep. In the end, I just used the Unix zip command. It was a lot easier for me.

So for this particular example from the original post:

    tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

the equivalent would be:

    zip -r /backup/filename.zip . -x upload/folder/**\* upload/folder2/**\*

(NOTE: Here is the post I originally used that helped me: https://superuser.com/questions/312301/unix-zip-directory-but-excluded-specific-subdirectories-and-everything-within-t)
GeertVc
13#
I've experienced that, at least with the Cygwin version of tar I'm using ("CYGWIN_NT-5.1 1.7.17(0.262/5/3) 2012-10-19 14:39 i686 Cygwin" on a Windows XP Home Edition SP3 machine), the order of options is important.

While this construction worked for me:

    tar cfvz target.tgz --exclude='' --exclude='' target_dir

that one didn't work:

    tar cfvz --exclude='' --exclude='' target.tgz target_dir

This, while tar --help reveals the following:

    tar [OPTION...] [FILE]

So the second command should also work, but apparently that is not the case...

Best rgds,
Andrew
14#
With GNU tar v1.26, the --exclude needs to come after the archive file and backup directory arguments, should have no leading or trailing slashes, and prefers no quotes (single or double). So, relative to the PARENT directory to be backed up, it's:

    tar cvfz /path_to/mytar.tgz ./dir_to_backup --exclude=some_path/to_exclude
Sverre
15#
Old question with many answers, but I found that none were quite clear enough for me, so I would like to add my try.

If you have the following structure:

    /home/ftp/mysite/

with the following files/folders:

    /home/ftp/mysite/file1
    /home/ftp/mysite/file2
    /home/ftp/mysite/file3
    /home/ftp/mysite/folder1
    /home/ftp/mysite/folder2
    /home/ftp/mysite/folder3

and you want to make a tar file that contains everything inside /home/ftp/mysite (to move the site to a new server), but file3 is just junk and everything in folder3 is also not needed, then we will skip those two.

We use the format

    tar -czvf

where c = create, z = zip, v = verbose (you can see the files as they are added, useful to make sure none of the files you exclude are being included), and f = file.

So my command would look like this:

    cd /home/ftp/
    tar -czvf mysite.tar.gz mysite --exclude='file3' --exclude='folder3'

Note that the files/folders excluded are relative to the root of your tar (I have tried a full path here relative to /, but I cannot make that work).

Hope this will help someone (and me next time I google it).
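One thing worth checking with bare names like 'file3' is how far the pattern reaches. The sketch below assumes GNU tar, where exclude patterns are unanchored by default, and adds a hypothetical second file3 inside folder1 to make the effect visible.

```shell
# Same shape as the mysite example, plus a nested file3 (hypothetical).
work=$(mktemp -d)
mkdir -p "$work/mysite/folder1" "$work/mysite/folder3"
touch "$work/mysite/file1" "$work/mysite/file3" \
      "$work/mysite/folder1/file3" "$work/mysite/folder3/junk"

# Bare names: 'file3' matches at any depth, so folder1/file3 is dropped too.
loose=$(tar -cf - --exclude='file3' --exclude='folder3' -C "$work" mysite |
          tar -tf - | sort)

# Path-qualified patterns only hit the two intended entries.
strict=$(tar -cf - --exclude='mysite/file3' --exclude='mysite/folder3' -C "$work" mysite |
           tar -tf - | sort)

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
rm -rf "$work"
```

Qualifying the pattern with the top-level directory name, as in the second command, keeps the nested file while still skipping the two unwanted entries.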
Undo
16#
After reading this thread, I did a little testing on RHEL 5, and here are my results for tarring up the abc directory.

This will exclude the directories error and logs and all files under those directories:

    tar cvpzf abc.tgz abc/ --exclude='abc/error' --exclude='abc/logs'

Adding a wildcard after the excluded directory will exclude the files but preserve the directories:

    tar cvpzf abc.tgz abc/ --exclude='abc/error/*' --exclude='abc/logs/*'
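The difference between the two forms can be reproduced on a scratch copy of the layout. This is a sketch assuming GNU tar; the file names e1, l1 and s1 are invented for the example.

```shell
# Scratch version of the abc tree.
work=$(mktemp -d)
mkdir -p "$work/abc/error" "$work/abc/logs" "$work/abc/src"
touch "$work/abc/error/e1" "$work/abc/logs/l1" "$work/abc/src/s1"

# Excluding the directory name removes the directory and everything in it.
no_dirs=$(tar -cf - --exclude='abc/error' --exclude='abc/logs' -C "$work" abc |
            tar -tf - | sort)

# Excluding dir/* removes the contents but leaves the empty directory entry.
empty_dirs=$(tar -cf - --exclude='abc/error/*' --exclude='abc/logs/*' -C "$work" abc |
               tar -tf - | sort)

printf 'no_dirs:\n%s\nempty_dirs:\n%s\n' "$no_dirs" "$empty_dirs"
rm -rf "$work"
```

The wildcard form is handy when the application expects the log directories to exist after a restore.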
Scott Stensland
17#
Scott Stensland Reply to 2015-02-12 20:55:16Z
This exclude pattern handles filename suffixes like png or mp3 as well as directory names like .git and node_modules:

    tar --exclude={*.png,*.mp3,*.wav,.git,node_modules} -Jcf ${target_tarball} ${source_dirname}
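The single --exclude={...} argument relies on shell brace expansion rather than on tar itself: bash rewrites it into one --exclude option per item before tar starts. A small sketch, assuming bash is installed (plain POSIX sh does not expand braces); the quotes around the wildcard items are added here so the shell cannot glob them.

```shell
# Ask bash to print the arguments tar would actually receive, one per line.
expanded=$(bash -c 'printf "%s\n" --exclude={"*.png","*.mp3",.git,node_modules}')
printf '%s\n' "$expanded"
```

Running the same expansion through echo or printf first is a cheap way to confirm what a brace pattern turns into before using it in a real backup command.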
Eric Manley
18#
Eric Manley Reply to 2015-05-14 14:10:55Z
You can use standard "ant notation" to exclude directories by relative path. This works for me and excludes any .git or node_modules directories:

    tar -cvf myFile.tar --exclude=**/.git/* --exclude=**/node_modules/* -T /data/txt/myInputFile.txt 2> /data/txt/myTarLogFile.txt

myInputFile.txt contains:

    /dev2/java
    /dev2/javascript
Aaron Votre
19#
Aaron Votre Reply to 2016-07-15 15:56:04Z
I agree the --exclude flag is the right approach.

    $ tar --exclude='./folder_or_file' --exclude='file_pattern' --exclude='fileA'

A word of warning for a side effect that I did not find immediately obvious: the exclusion of 'fileA' in this example will search for 'fileA' RECURSIVELY!

Example: a directory with a single subdirectory containing a file of the same name (data.txt):

    data.txt
    config.txt
    --+dirA
      |  data.txt
      |  config.docx

If using --exclude='data.txt', the archive will not contain EITHER data.txt file. This can cause unexpected results if archiving third-party libraries, such as a node_modules directory.

To avoid this issue, make sure to give the entire path, like --exclude='./dirA/data.txt'.
RohitPorwal
20#
RohitPorwal Reply to 2016-07-21 09:56:34Z
Check it out:

    tar cvpzf zip_folder.tgz . --exclude=./public --exclude=./tmp --exclude=./log --exclude=fileName

Community
21#
Community Reply to 2017-05-23 12:02:58Z
The following bash script should do the trick. It uses the answer given here by Marcus Sundman.

    #!/bin/bash

    echo -n "Please enter the name of the tar file you wish to create without extension "
    read nam

    echo -n "Please enter the path to the directories to tar "
    read pathin

    echo tar -czvf $nam.tar.gz
    # collect one --exclude '<file>' argument per matching file
    excludes=$(find $pathin -iname "*.CC" -exec echo "--exclude \'{}\'" \;|xargs)
    echo $pathin

    echo tar -czvf $nam.tar.gz $excludes $pathin

This will print out the command you need, and you can just copy and paste it back in. There is probably a more elegant way to provide it directly to the command line.

Just change *.CC for any other common extension, file name or pattern you want to exclude and this should still work.

EDIT: Just to add a little explanation: find generates a list of files matching the chosen pattern (in this case *.CC). This list is passed via xargs to the echo command. This prints --exclude 'one entry from the list'. The backslashes (\) are escape characters for the ' marks.
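One possible way to skip the copy-and-paste step in the script above is to let find write its matches to a file and hand that file to tar with -X. This is only a sketch under assumptions: the proj tree and excludes.txt are hypothetical names, and file names containing wildcard characters or newlines would be misread, since tar treats each line as a pattern.

```shell
# Hypothetical project tree with two files that should be left out.
work=$(mktemp -d)
mkdir -p "$work/proj/a" "$work/proj/b"
touch "$work/proj/a/x.CC" "$work/proj/a/x.h" "$work/proj/b/y.cc"

# Write matching paths to a file instead of echoing --exclude flags.
# find runs from $work so the paths match the member names tar will use.
(cd "$work" && find proj -iname '*.CC' > "$work/excludes.txt")

# Archive with the generated exclude list and record what was kept.
cc_listing=$(tar -czf - -X "$work/excludes.txt" -C "$work" proj | tar -tzf - | sort)
printf '%s\n' "$cc_listing"
rm -rf "$work"
```

For a plain suffix, a single --exclude='*.CC' is usually enough; the find step mainly pays off when the selection logic is more involved than a name pattern.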
Jerinaw
22#
For Mac OS X I had to do:

    tar -zcv --exclude='folder' -f theOutputTarFile.tar folderToTar

Note the -f after the --exclude=
    tar -cvzf destination_folder source_folder -X /home/folder/excludes.txt

-X indicates a file which contains a list of filenames which must be excluded from the backup. For instance, you can specify *~ in this file to not include any filenames ending with ~ in the backup.