
Help correlating two files

  • 04-03-2008 10:03am
    #1
    Registered Users, Registered Users 2 Posts: 1,227 ✭✭✭


    Hello All,

    I have a bit of a problem....

    I have two files File_A and File_B

    File_A contains information in the format

    Value|number|value|value|value
    Value|number|value|value|value
    Value|number|value|value|value
    and so on for about 90 thousand lines

    File_B is in the format

    Number|value|value|value
    Number|value|value|value
    Number|value|value|value
    Number|value|value|value
    and so on for about 40 thousand lines

    What I want to do is run a command to see if any of the numbers in File_A are also present in File_B. Is something like this possible? Maybe something like sort to take out column 2 from File_A, and sort again to take out column 1 of File_B, to another file? Then perhaps use something like diff that doesn't care what order the files are in?

    Any help here is very much appreciated.

    Regards,
    Steve


Comments

  • Registered Users, Registered Users 2 Posts: 37,485 ✭✭✭✭Khannie


    can you give us some actual sample lines? I'll try to write you a perl script to convert one format to the other.


  • Registered Users, Registered Users 2 Posts: 1,227 ✭✭✭stereo_steve


    Thanks for your help, but I can't give any sample data. I would lose my job!

    I don't want to touch the original files or change their format. I just want to see if a number in File_A is also present somewhere in File_B and how many times this occurs.

    I thought this might be possible using the command line piping information from one command to another. I just didn't know how. There mightn't be a solution?
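
The piping idea described above could look something like the sketch below. It assumes the number is always field 2 of File_A and field 1 of File_B, that `|` never appears inside a value, and that `mktemp` is available. The function name `common_numbers` is made up for illustration. `cut` pulls out the column, `sort -u` removes duplicates and puts both lists in the same order, and `comm -12` prints only the lines common to both.

```shell
#!/bin/sh
# Hypothetical helper: print every number that occurs in both files.
#   $1 = File_A (number assumed to be in field 2)
#   $2 = File_B (number assumed to be in field 1)
# The original files are only read, never modified.
common_numbers() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || return 1
    # Extract the number column from each file, sorted and de-duplicated.
    cut -d'|' -f2 "$1" | sort -u > "$tmp_a"
    cut -d'|' -f1 "$2" | sort -u > "$tmp_b"
    # comm -12 hides lines unique to either input, leaving the intersection.
    comm -12 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

Running `common_numbers File_A File_B | wc -l` would then give the count of distinct numbers that appear in both files.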


  • Registered Users, Registered Users 2 Posts: 2,032 ✭✭✭colm_c


    Should be handy enough to do, but you would have to write some kind of script to do it.

    Easiest way would be load it into an array, and then do a compare and output the results.

    With 40 thousand lines, it would be more efficient to load it into a db of some sort, temporarily at least, and run your queries against it. That would also give you an easy option of doing it in batches, as with that much comparing your script will either time out or kill your machine in the process!
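
As a rough sketch of the "load it into an array, compare, output" approach, the awk one-pass lookup below keeps the File_B numbers in an in-memory array and checks each File_A line against it. It assumes both files fit in memory (likely at these sizes, though not tested against the real data), that File_B is not empty, and the same column layout as in the original question. The name `count_matches` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: for each number present in both files, print the
# number and how many File_A lines carry it.
#   $1 = File_A (number assumed to be in field 2)
#   $2 = File_B (number assumed to be in field 1)
count_matches() {
    awk -F'|' '
        # First file on the command line (File_B): remember each number.
        NR == FNR { seen[$1] = 1; next }
        # Second file (File_A): count lines whose number was remembered.
        ($2 in seen) { hits[$2]++ }
        END { for (n in hits) print n, hits[n] }
    ' "$2" "$1" | sort
}
```

Each output line is a number followed by its count, so `count_matches File_A File_B` answers both "is it present" and "how many times" in one go.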


  • Registered Users, Registered Users 2 Posts: 2,621 ✭✭✭GreenHell


    That should be easy enough to do with the split function in Perl. If you can export those files to CSV, you're laughing.


  • Closed Accounts Posts: 1,444 ✭✭✭Cantab.


    Yep, Perl is your friend.
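    use strict;
    use warnings;

    my $filename1='/home/your/file/file1.txt';   # File_A: number in column 2
    my $filename2='/home/your/file/file2.txt';   # File_B: number in column 1
    open(my $fh1, '<', $filename1) or die "Cannot open $filename1: $!";
    open(my $fh2, '<', $filename2) or die "Cannot open $filename2: $!";

    # Remember every number that appears in the first column of File_B.
    my %in_b;
    while (my $line = <$fh2>)
    {
        chomp($line);                        # chomp only strips the newline
        my @array=split(/\|/,$line);         # the pipe must be escaped in a regex
        $in_b{$array[0]} = 1 if @array;
    }

    # Count the File_A lines whose second column is one of those numbers.
    my $matches = 0;
    while (my $line = <$fh1>)
    {
        chomp($line);
        my @array=split(/\|/,$line);
        $matches++ if defined $array[1] && exists $in_b{$array[1]};
    }
    close($fh1);
    close($fh2);
    print "$matches lines in File_A have a number that is also in File_B\n";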
    

    Also have a look at Sort::Fields if you want to sort columns...

