Stupid shell/piping question (tail) - clearing header/footer

PhantomBeaker · 22-02-2007 12:25pm #1

I'm on Solaris 10, using the supplied programs such as head and tail.

I have a massive file that I want to process with awk and the like, but it has about 14 lines of header and 6 lines of footer that I want to strip before I do the parsing.

Looking up the tail manpage I've found a nice little bit of syntax:

tail -5 file # Returns the last 5 lines
tail +5 file # Returns every line >= 5

Unfortunately head doesn't have that nicety.

By the way, the file is about 234k in size, and is likely to be bigger in the coming while.

So my current method of getting the file prepared (and the order doesn't matter in this, thankfully):

tail +15 file | tail -r | tail +7

Now, as I mentioned, the file's going to get longer than 800 lines. Running "tail -r" is only so I can then get at the middle of the file, because I can't think of a command that says "give me everything but the last 6 lines". The catch is that tail -r has to collect everything into memory before it spews it out in reverse order.

Is there any more elegant way to do this without resorting to scripting languages like perl or python?

Thanks.
Æ

Talon.ie · 22-02-2007 6:28pm

This command will remove the first 14 lines and the last 6 lines of your file and put the rest into newFile. Then it just moves newFile to file and you're done.

sed '1,14d' file | sed -e :a -e '$d;N;2,6ba' -e 'P;D' > newFile && mv newFile file

There are other ways that will work, but that's the first that springs to mind. If the file's going to be really big then this might not be fast enough for you, but 800 lines won't cause it a problem.
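For anyone puzzling over the second sed, here is a rough sketch of what the loop does, run against made-up sample data (the make_sample and trim_sed names are invented for illustration and are not part of the command above). As far as I can tell, the idea is that sed holds back the most recent lines in its pattern space: N keeps appending until six lines are held, P;D then releases the oldest line each time a new one arrives, and $d throws away whatever is still held when the last line is reached.

```shell
# Made-up sample: 14 header lines, 5 body lines, 6 footer lines.
make_sample() {
  awk 'BEGIN {
    for (i = 1; i <= 14; i++) print "header " i
    for (i = 1; i <= 5; i++)  print "body " i
    for (i = 1; i <= 6; i++)  print "footer " i
  }'
}

# Same two sed stages as the command above, but reading stdin.
# Stage 1 deletes lines 1-14 (the header).
# Stage 2 holds back six lines and discards them at end of input.
trim_sed() {
  sed '1,14d' | sed -e :a -e '$d;N;2,6ba' -e 'P;D'
}

make_sample | trim_sed   # prints only the five "body" lines
```

Changing the 6 in 2,6ba changes how many trailing lines are dropped; I would only trust that for values of 2 or more.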

Skrynesaver · 26-02-2007 4:49pm

Lop 14 off the top and 6 off the tail:

head -$( expr $(wc -l filename |awk '{print $1}') - 6) filename |tail +15
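A slightly more spelled-out sketch of the same counting idea, in case the nested substitutions are hard to read. The trim_count name and its three arguments are made up for this example, and it uses the POSIX "-n" spellings of head and tail rather than the older "+15" form. Note that it reads the file twice (once for wc, once for head), so it needs a real file rather than a pipe.

```shell
# Hypothetical helper: print FILE without its first HEAD_LINES
# and its last FOOT_LINES lines.
# usage: trim_count FILE HEAD_LINES FOOT_LINES
trim_count() {
  file=$1
  head_lines=$2
  foot_lines=$3

  # Reading from stdin keeps the filename out of wc's output.
  total=$(wc -l < "$file")
  keep=$((total - foot_lines))

  # Nothing to print if header and footer cover the whole file.
  [ "$keep" -gt "$head_lines" ] || return 0

  # Keep everything up to the footer, then skip past the header.
  head -n "$keep" "$file" | tail -n "+$((head_lines + 1))"
}
```

For the file in the question that would be called as: trim_count file 14 6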

Snowbat · 03-03-2007 9:30am

Assuming tac is available (?), you could pipe it through tac, do your stuff with tail, then pipe through tac again to put it right-way-up.
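Roughly what that would look like, as a sketch only: tac comes from GNU coreutils, so this assumes it has been installed separately, and trim_tac is just a name made up for the example. The first tail skips the 14 header lines, the first tac turns the file upside down so the footer is on top, the second tail skips those 6 lines, and the final tac restores the order.

```shell
# Hypothetical helper: strip a 14-line header and 6-line footer from stdin.
# Assumes GNU tac is on the PATH.
trim_tac() {
  tail -n +15 | tac | tail -n +7 | tac
}
```

It would be used as: trim_tac < file > newFile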

electrofelix · 06-03-2007 2:07pm

You could use awk to skip the first 14 lines.

awk 'BEGIN { ignore_head=14 } { if (ignore_head > 0) { ignore_head-- } else { print $0 } }' file

The awk script can be expanded to cut the last 6 lines as well, but that makes it significantly more complex, see next.

awk 'BEGIN { ignore_head=14; ignore_tail=6; current=0; output=0 } { if (ignore_head > 0) { ignore_head-- } else { if (output) { print mem[current] } mem[current]=$0; current++; if (current >= ignore_tail) { current=0; output=1 } } }' file

I haven't checked it too thoroughly, but it should work as required, and I suspect it should be reasonably quick since it doesn't require re-reading the entire file.

I'm sure someone with a bit more awk-foo than me can shorten it more.
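One possible shortening, offered as a sketch rather than a tested-on-Solaris answer: the same ring buffer, but indexed with NR and a modulo instead of separate counters. The trim_awk name and the head/foot variables are invented for this example. It assumes an awk that understands -v (on Solaris 10 that probably means nawk or /usr/xpg4/bin/awk rather than the default awk) and a footer of at least one line.

```shell
# Hypothetical helper: strip HEAD leading and FOOT trailing lines from stdin.
# usage: trim_awk HEAD FOOT   (FOOT must be >= 1)
trim_awk() {
  awk -v head="$1" -v foot="$2" '
    NR <= head { next }                   # skip the header
    { slot = (NR - head - 1) % foot }     # position in the ring buffer
    NR > head + foot { print buf[slot] }  # emit the line from foot lines ago
    { buf[slot] = $0 }                    # then remember the current line
  '
}
```

Used as: trim_awk 14 6 < file. The last 6 lines are still sitting in the buffer when the input ends, so they are never printed, and the file is only read once.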

PhantomBeaker wrote:

Is there any more elegant way to do this without resorting to scripting languages like perl or python?

Provided the file is small enough to be held in cache, the original solution you came up with appears to be just as quick as any of the others mentioned. So I suggest that you stick with the "tail +15 file | tail -r | tail +7".

Otherwise, try testing the different solutions using time, with the output redirected to /dev/null, and compare the real, user and sys times for an idea of how long each solution takes.
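Something along these lines would do for the comparison. It is a sketch under a few assumptions: the via_sed, via_count and compare_timings names are made up, only two of the suggested solutions are wrapped, and timing a shell function like this relies on a shell such as bash or ksh where time is a keyword rather than an external program.

```shell
# Two of the suggested solutions, each reading the file named in $1.
via_sed() {
  sed '1,14d' "$1" | sed -e :a -e '$d;N;2,6ba' -e 'P;D'
}

via_count() {
  total=$(wc -l < "$1")
  head -n "$((total - 6))" "$1" | tail -n +15
}

# Time each variant with its output thrown away.
compare_timings() {
  for variant in via_sed via_count; do
    echo "== $variant"
    time "$variant" "$1" > /dev/null
  done
}
```

Run it as: compare_timings file. Running it a few times is probably wise, since the first pass may include the cost of pulling the file into cache.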

generalmiaow · 10-03-2007 4:32pm

electrofelix wrote:

Provided the file is small enough to be held in cache, the original solution you came up with appears to be just as quick as any of the others mentioned. So I suggest that you stick with the "tail +15 file | tail -r | tail +7".

I agree. I actually really like the OP's suggestion; it's a lot easier to remember, for one thing.
