Problem solving

Training to be a Pen Tester is very different from doing the job of a Pen Tester. No one who is in the job can really give a new learner any advice about Pen Testing, because each company may do things a little differently from the next.

This was a conclusion I came to only after I became a junior tester. Not only that, there’s a ton of information to get through, so your head is pretty much mush for months, learning new things every day.


I came across a time management problem recently. I needed to extract 187 IP addresses from the output of a tool in an HTML document. Why, I hear you ask? I needed to format the IP addresses so that Excel would align them correctly in a cell.

I didn’t have time to copy each IP address individually and paste it into Excel in the format required.

Output data (for example) (tcp/443)
Apache Web Server (tcp/443)
Apache Web Server (tcp/443)
Apache Web Server

Excel expected data;;;

Formatting this data by way of copy and paste was not an option.


The Linux command line interface is probably one of the most important tools, if not the most important, for a tester (I’m only just realising this now).

Tasks I needed to perform:

  • Cut the IP address from the text
  • Add a semi-colon to the end of each IP address
  • Produce a long string that could be copied into Excel.

One Liner

Copying all the data from the HTML output and pasting it into a file was a good start. Then, through trial and error, I got my output to the point where it was usable with this command string:

cat file.txt | grep 192 | cut -f1 -d "(" | sed 's/$/;/' | tr -d ' ' | xargs

The tr -d ' ' was only needed on macOS, because there the sed substitution for some reason left a space between the end of the IP address and the semi-colon. On Linux this doesn’t happen.
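One possible explanation for the stray space, offered as a guess and not a diagnosis: if each line looks like an IP address followed by a space and then "(tcp/443)", cut keeps everything before the first "(", including that space, and sed then appends the semi-colon after it. The sample line below is made up (a documentation address), and the space before the parenthesis is an assumption about the input.

```shell
# Made-up sample line; the space before "(" is an assumption about the input.
line='192.0.2.10 (tcp/443)'

# cut keeps everything before the first "(", trailing space included,
# so the semi-colon lands after the space.
with_space=$(printf '%s\n' "$line" | cut -f1 -d '(' | sed 's/$/;/')

# Replacing any trailing whitespace in the same sed call avoids the extra tr step.
trimmed=$(printf '%s\n' "$line" | cut -f1 -d '(' | sed 's/[[:space:]]*$/;/')

# Brackets make the whitespace visible.
printf '[%s]\n[%s]\n' "$with_space" "$trimmed"
```

If that is what was going on, trimming inside sed would behave the same on macOS and Linux.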

This command string turned my data into the required Excel format.
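As an alternative sketch, not the command I actually used: grep’s -o option prints only the part of each line that matches, which may remove the need for cut and tr. The function name extract_ips, the regular expression and the sample addresses are all assumptions for illustration. The pattern matches anything shaped like four dotted numbers and does not check that each number is 255 or less. The -o option is not in POSIX, but both GNU and BSD grep support it.

```shell
# Hypothetical helper: print every IPv4-shaped string in a file,
# each followed by a semi-colon, joined onto one line.
extract_ips() {
    # -o: print only the matching text; -E: extended regular expressions
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" |
        sed 's/$/;/' |   # append a semi-colon to each address
        xargs            # join all lines into one, separated by spaces
}

# Demo with made-up documentation addresses.
sample=$(mktemp)
printf '%s\n' '192.0.2.10 (tcp/443)' 'Apache Web Server' '192.0.2.11 (tcp/443)' > "$sample"
extract_ips "$sample"
rm -f "$sample"
```

Matching on the shape of an address instead of the literal string 192 also means lines are not picked up just because 192 appears somewhere in them.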


Some may wonder why this is even a blog post. It’s low-level, simple data manipulation. Yes, that’s true; however, it’s a side of the job that no one really thinks about. Time-sensitive data manipulation on a large scale is required, so it’s best to get used to it.

During OSCP, you’re only ever really working on one or two boxes at a time. Think larger, think hundreds of IP addresses and how you would cope with that load.