Unix find average file size

I have a directory with a ton of files, and I want to find the average size of those files. Is there something like ls <something> that reports the average file size of everything that matches?

Answer 1:

I found something here:
http://vivekjain10.blogspot.com/2008/02/average-file-size-within-directory.html

To calculate the average file size within a directory on a Linux system, the following command can be used:

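ls -l | gawk '/^-/ {sum += $5; n++} END {if (n) print sum/n}'

The /^-/ pattern restricts the sum to regular files. It also skips the "total" line that ls -l prints first, which would otherwise be counted as a file.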

Answer 2:

A short, general and recursion-friendly variation of Ernstsson's answer:

find ./ -ls | awk '{sum += $7; n++;} END {print sum/n;}'

Or, for example, if you want to prevent files above 100 KB from skewing the average:

find ./ -size -100000c -ls | awk '{sum += $7; n++;} END {print sum/n;}'
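As a hedged sketch, the size-capped variant above can be wrapped in a small shell function. The name avg_size_under and its two parameters are illustrative assumptions, not part of the original answer. The sketch also adds -type f so that directories are not counted, and it prints 0 instead of dividing by zero when nothing matches.

```shell
# Hypothetical helper: average size in bytes of regular files under a directory,
# leaving out any file of max_bytes bytes or more.
# Assumes a find that supports -ls with the size in field 7 (GNU, BSD, AIX).
avg_size_under() {
    dir=$1        # directory to search recursively
    max_bytes=$2  # files this size or larger are left out
    find "$dir" -type f -size -"${max_bytes}c" -ls |
        awk '{ sum += $7; n++ }
             END { if (n) printf "%.0f\n", sum / n; else print 0 }'
}
```

For example, avg_size_under . 100000 averages every regular file below the current directory that is smaller than 100000 bytes.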

Answer 3:

Use wc -c * to get the size of all the files (the last line of its output is the total) and ls | wc -l to get the number of files. Then just divide the first by the second.
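A minimal sketch of that idea, assuming a POSIX shell. The function name avg_by_wc is made up for illustration. It loops over the directory entries rather than parsing the combined wc -c * output, so subdirectories are skipped. Like the * glob, it does not see hidden files, and the result is truncated to a whole number of bytes.

```shell
# Hypothetical helper: integer average size in bytes of the regular,
# non-hidden files directly inside a directory.
avg_by_wc() {
    dir=$1
    total=0
    count=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip subdirectories and unmatched globs
        size=$(wc -c < "$f")             # byte count of this one file
        total=$((total + size))
        count=$((count + 1))
    done
    if [ "$count" -gt 0 ]; then
        echo $((total / count))          # shell arithmetic truncates
    else
        echo 0                           # no files: avoid dividing by zero
    fi
}
```

Calling avg_by_wc . reports the average for the current directory.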

Answer 4:

du -sh . # gives the total space used by the directory

find . -type f | wc -l # count the number of files

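Divide the first by the second. If you want a one-liner, here it is (it uses du -sb, a GNU option that reports the size in bytes, so the result can be used in arithmetic):

echo $(( $(du -sb . | cut -f1) / $(find . -type f | wc -l) ))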

Answer 5:

Two common tasks are finding the size of a directory and finding the amount of free disk space on your machine. The command to find a directory's size is 'du', and the command to find free disk space is 'df'.

All the information present in this article is available in the man pages for du and df. In case you get bored reading the man pages and you want to get your work done quickly, then this article is for you.

'du' - Finding the size of a directory

**$ du**

Typing the above at the prompt gives you a list of directories that exist in the current directory along with their sizes. The last line of the output gives you the total size of the current directory including its subdirectories. The size given includes the sizes of the files and the directories that exist in the current directory as well as all of its subdirectories. Note that by default the sizes given are in kilobytes.

**$ du /home/david**

The above command would give you the directory size of the directory /home/david

**$ du -h**

This command gives you more readable output than the default one. The option '-h' stands for human-readable format. So the sizes of the files / directories are this time suffixed with a 'K' if it's kilobytes, 'M' if it's megabytes and 'G' if it's gigabytes.

**$ du -ah**

This command would display in its output, not only the directories but also all the files that are present in the current directory. Note that 'du' always counts all files and directories while giving the final size in the last line. But the '-a' displays the filenames along with the directory names in the output. '-h' is once again human readable format.

**$ du -c**

This gives you a grand total as the last line of the output. So if your directory occupies 30MB the last 2 lines of the output would be

30M .
30M total

The first line would be the default last line of the 'du' output indicating the total size of the directory, and the second line displays the same size followed by the string 'total'. This is helpful in case you use this command along with the grep command to display only the final total size of a directory, as shown below.

**$ du -ch | grep total**

This would have only one line in its output that displays the total size of the current directory including all the subdirectories.
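One caveat: grep total also matches any directory whose name contains "total", so the output can be more than one line. Since -c always prints the grand total last, a hedged alternative is to take the final line with tail. The function name dir_total below is hypothetical.

```shell
# Hypothetical helper: print only the grand-total line of du -ch for a directory.
dir_total() {
    # -c appends the grand total as the last line; tail -n 1 keeps just that line,
    # regardless of what the directories themselves are called.
    du -ch "$1" | tail -n 1
}
```

Running dir_total . prints a single line such as the "30M total" line described above.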

Note : In case you are not familiar with pipes (which makes the above command possible) refer to Article No. 24 . Also grep is one of the most important commands in Unix. Refer to Article No. 25 to know more about grep.

**$ du -s**

This displays a summary of the directory size. It is the simplest way to know the total size of the current directory.

**$ du -S**

This would display the size of each directory excluding the sizes of the subdirectories that exist within it. So for the current directory it basically shows you the total size of the files that sit directly in it.

**$ du --exclude='*.mp3'**

The above command would display the size of the current directory along with all its subdirectories, but it would exclude all the files whose names match the given pattern. The pattern is a shell-style wildcard, so it needs the leading '*' to match names ending in .mp3. Thus in the above case, if there happen to be any mp3 files within the current directory or any of its subdirectories, their size would not be included while calculating the total directory size.
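GNU du treats the --exclude pattern as a shell-style wildcard matched against file names, so excluding MP3 files needs a pattern such as '*.mp3'. The following is a minimal sketch assuming GNU du, since both --exclude and -b are GNU options. The function name du_excluding is hypothetical.

```shell
# Hypothetical helper: apparent size in bytes of a directory tree,
# leaving out files whose names match a shell glob.
# Assumes GNU du (for --exclude and -b).
du_excluding() {
    dir=$1      # directory to measure
    pattern=$2  # glob for files to leave out, e.g. '*.mp3'
    # Quoting the pattern stops the shell from expanding it before du sees it.
    du -sb --exclude="$pattern" "$dir" | cut -f1
}
```

Comparing du_excluding . '*.mp3' with the first field of du -sb . shows how many bytes the MP3 files account for.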

'df' - finding the disk free space / disk usage

**$ df**

Typing the above, outputs a table consisting of 6 columns. All the columns are very easy to understand. Remember that the 'Size', 'Used' and 'Avail' columns use kilobytes as the unit. The 'Use%' column shows the usage as a percentage which is also very useful.

**$ df -h**

Displays the same output as the previous command but the '-h' indicates human readable format. Hence instead of kilobytes as the unit the output would have 'M' for Megabytes and 'G' for Gigabytes.

Most of the users don't use the other parameters that can be passed to 'df'. So I shall not be discussing them.

I shall in turn show you an example that I use on my machine. I have actually stored this as a script named 'usage' since I use it often.

Example :

I have my Linux installed on /dev/hda1 and I have mounted my Windows partitions as well (by default every time Linux boots). So 'df' by default shows me the disk usage of my Linux as well as Windows partitions. And I am only interested in the disk usage of the Linux partitions. This is what I use :

**$ df -h | grep /dev/hda1 | cut -c 41-43**

This command displays the following on my machine

45%

Basically this command makes 'df' display the disk usages of all the partitions and then extracts the lines with /dev/hda1 since I am only interested in that. Then it cuts the characters from the 41st to the 43rd column since they are the columns that display the usage in % , which is what I want.
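Cutting fixed character columns depends on the exact width of the df output, which changes with the length of the device name. As a hedged alternative, the sketch below picks the percentage by field position instead. It assumes the POSIX output format selected by df -P, where the percentage is the fifth field, and the function name usage_percent is hypothetical.

```shell
# Hypothetical helper: print the usage percentage of the filesystem
# that holds the given path or device.
usage_percent() {
    # -P forces one line per filesystem; line 1 is the header, line 2 the data row.
    df -P "$1" | awk 'NR == 2 { print $5 }'
}
```

On the machine described above, usage_percent /dev/hda1 would be the equivalent call.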

There are a few more options that can be used with 'du' and 'df' . You could find them in the man pages.

Answer 6:

This works portably, even on AIX. Outputs average number of bytes for plain files in the specified directory (${directory} in the example below):

find "${directory}" '!' -path "${directory}" -prune -type f -ls | awk '{s+=$7} END {printf "%.0f\n", s/NR}'

There is no need to count the files yourself: NR is an awk built-in that holds the number of records (lines) read.

The '!' -path ${directory} -prune part is a portable way to achieve the equivalent of GNU find -maxdepth 1 by pruning any path that is not the same as the one we start at, thereby ignoring any subdirectories.

Adjust with restrictions on what files to count. For instance, to average all files except *.sh in the current directory, you could add '!' -name '*.sh':

find . '!' -path . -prune -type f '!' -name '*.sh' -ls | awk '{s+=$7} END {printf "%.0f\n", s/NR}'

or to count only *.mp3 and include all subdirectories (remove '!' -path . -prune):

find . -type f -name '*.mp3' -ls | awk '{s+=$7} END {printf "%.0f\n", s/NR}'
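One edge case in the commands above: if find matches nothing, NR is 0 and awk divides by zero. The sketch below wraps the non-recursive variant in a function with a guard for that case. The name avg_top_level is made up for illustration, and printing 0 for an empty directory is an assumption about what is wanted.

```shell
# Hypothetical helper: rounded average size in bytes of the regular files
# directly inside a directory, without descending into subdirectories.
avg_top_level() {
    dir=$1  # directory whose immediate regular files are averaged
    # '!' -path "$dir" -prune stops find from descending below the top level.
    find "$dir" '!' -path "$dir" -prune -type f -ls |
        awk '{ s += $7 }
             END { if (NR) printf "%.0f\n", s / NR; else print 0 }'
}
```

Because -path takes a pattern, this sketch assumes the directory name contains no wildcard characters such as * or [.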