
Python help

  • 08-04-2013 1:59pm
    #1
    Closed Accounts Posts: 3,981 ✭✭✭


    Good afternoon,

    I'm having some issues with Python. I'm extremely new at Python - just picked it up a couple of days ago.

    My code does three things.
    1. Get a list of files by doing an ls.
    2. Get a list of times by doing a cat.
    3. Get a numerical value (a count) by doing greps.

    Essentially it breaks down to lots of grep|wc commands, each of which returns a number. I'm parsing logs to see how many queries our application gets every second, and I wish to store this in a CSV file so I can insert it into a db. I would use the MySQL module, but I don't have root access on our server and the guys who maintain it will take weeks to install the module for me. I know I can install it manually, which I tried, but I end up needing a couple of .rpms before I can get the module working.

    Anyway, back on topic. Here is the code:

    #!/apps2/peter/python/python
    import os
    import sys
    import subprocess
    import shlex
    import csv
    from subprocess import check_output
    filecmd = ["ls newsperf.log.2013-04-04*"]
    cmds = []
    dates = []
    host = "myhost"
    
    c = csv.writer(open("MYFILE.csv", "wb"))
    
    #get list of files
    p1 = subprocess.Popen(filecmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, shell=True)
    fileoutput  = subprocess.check_output(filecmd, shell=True)
    files = fileoutput.split()
    
    #get list of times
    timecmd = ["cat times"]
    p3 = subprocess.Popen(timecmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, shell=True)
    timeoutput = subprocess.check_output(timecmd, shell=True)
    times = timeoutput.split()
    
    
    #for each file in the list, extract the date, create a list of commands
    for file in files:
            for time in times:
                    cmds += [["grep :" + time + " " + file + "|wc -l"]]
    
    #for each command, execute and print the value
    for cmd in cmds:
            p2 = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, shell=True)
            output = subprocess.check_output(cmd, shell=True)
            d = output.split()
            d = map(int,d)
            con=0
            for dd in d:
                    print dd
                    c.writerow([dd])  # write the count itself, not the literal string "dd"
    

    So this code works fine up to this point.

    What I'm looking for next is to create the csv in the following format: date, time, host, value

    Date will be extracted from the file name (removing newsperf.log.), time will come from 'times', host comes from 'host' and value is the last part of the code there.

    I tried referencing times[0] and files[0]; however, this only shows the first character of the time and the first character of the file.

    If I iterate through each like so:
    for time in times:
                      print time
    
    It will print the entire thing.

    So my question really is how do I get the first "element" from each of the lists without a for loop?

    I was hoping times[0] would work. :o


Comments

  • Closed Accounts Posts: 3,981 ✭✭✭[-0-]


    I think I got it, this is sort of messy but it works.
    for file in files:
            filestr = file + ","
            test = filestr.split(',')
            
    print test[0]
    

    Is there a cleaner way to do this?
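
    One possibly cleaner approach, sketched here for the date, time, host, value format described in the original post: slice the known prefix off the file name and hand each row to csv.writer. The helper names and the sample record are made up for illustration, and the sketch assumes Python 3 (the thread's code is Python 2). If the real file names carry anything after the date, that suffix would be included in the date column.

```python
import csv
import io

PREFIX = "newsperf.log."  # file name prefix mentioned in the original post


def date_from_name(filename):
    """Return the part of the file name after the 'newsperf.log.' prefix."""
    if filename.startswith(PREFIX):
        return filename[len(PREFIX):]
    return filename


def write_rows(out, records, host):
    """Write one 'date, time, host, value' row per (filename, time, value) record."""
    writer = csv.writer(out)
    for filename, time, value in records:
        writer.writerow([date_from_name(filename), time, host, value])


# Hypothetical sample record; real values would come from the grep counts.
sample = [("newsperf.log.2013-04-04", "10:00:01", 42)]
buffer = io.StringIO()
write_rows(buffer, sample, "myhost")
print(buffer.getvalue())
```

    Writing to an in-memory buffer keeps the example self-contained; an open file object could be passed to write_rows in the same way.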


  • Registered Users Posts: 2,021 ✭✭✭ChRoMe


    What you are doing seems more suited to a shell script as you are just calling Linux Bash commands?
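
    For comparison, the three steps from the original post (list the files, read the times, count matching lines) can also be done without calling out to the shell. This is only a rough sketch: the function names are made up for illustration, and it treats the time as a plain substring rather than a grep regular expression, which should be equivalent for values made of digits and colons.

```python
import glob


def count_matches(path, time):
    """Count lines in path containing ':' + time, like grep ':TIME' FILE | wc -l."""
    needle = ":" + time
    count = 0
    with open(path) as log:
        for line in log:
            if needle in line:
                count += 1
    return count


def read_times(path):
    """Return the whitespace-separated entries of the times file as a list."""
    with open(path) as handle:
        return handle.read().split()


def count_all(pattern, times_path):
    """Yield (file, time, count) for every log file matching pattern and every time."""
    times = read_times(times_path)
    for path in sorted(glob.glob(pattern)):
        for time in times:
            yield path, time, count_matches(path, time)
```

    Calling count_all("newsperf.log.2013-04-04*", "times") would then give one tuple per file and time, with the file name and time still attached to each count.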


  • Registered Users Posts: 291 ✭✭Seridisand


    You could try specifying a range with a slice:
    for time in times[0:1]:
                  print time
    


  • Registered Users Posts: 4,766 ✭✭✭cython


    Depending on the content of the times variable (it looks like it's a single string at the moment, hence times[0] printing a single character while the loop prints the whole lot), you might need to look at times.splitlines()[0] or times.split(delimitingCharacter)[0] to divide the string into useful elements.
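
    To illustrate the difference, here is a small sketch with made-up sample data (the time values are assumptions, not taken from the real times file). Indexing a string gives one character, while indexing the list returned by splitlines() or split() gives a whole entry.

```python
# Sample content standing in for the output of "cat times" (hypothetical values).
timeoutput = "10:00:01\n10:00:02\n10:00:03\n"

# Indexing the raw string returns a single character.
first_char = timeoutput[0]

# Splitting first gives a list, so [0] is the whole first entry.
first_by_lines = timeoutput.splitlines()[0]
first_by_split = timeoutput.split()[0]

print(first_char)
print(first_by_lines)
print(first_by_split)
```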


  • Closed Accounts Posts: 3,981 ✭✭✭[-0-]


    I'm all set guys - thanks. The split approach from my earlier post did the trick.

