
Wget with url with hash - how to process it in python

  • 27-01-2020 12:57pm
    #1
    Registered Users Posts: 5,553 ✭✭✭


    I'm running wget from Python on the URL http://pypi.org/project/pip/#files:
    self.run_command('("wget http://pypi.org/project/pip/\#files -O index1.html")')
    

    My log shows the command running without anything from the hash onward:
    2020-01-27 11:37:23,128 020776:084 INFO:  wget http://pypi.org/project/pip/
    

    I've tried it without the quotes, brackets and escape characters but get the same result. Anyone have any idea?


Comments

  • Registered Users Posts: 6,236 ✭✭✭Idleater


    Have a look at urlencode
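    A minimal sketch of percent-encoding with Python's urllib.parse, assuming that is what "urlencode" refers to here. It shows '#' becoming %23 for the URL in the first post. Bear in mind that %23 is a literal '#' character in the path, so the server is asked for a different path than the original URL with its fragment.

```python
from urllib.parse import quote, unquote

fragment = "#files"

# quote() percent-encodes characters that are not URL-safe; '#' becomes %23.
encoded = quote(fragment, safe="")
print(encoded)

# unquote() reverses the encoding.
print(unquote(encoded))

# The percent-encoded form of the URL from the first post.
url = "http://pypi.org/project/pip/" + encoded
print(url)
```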


  • Registered Users Posts: 880 ✭✭✭clearz


    http://pypi.org/project/pip/%23files
    
    should work. If not, try curl instead of wget, if it's installed.

    I don't know much about the Python standard library, but I'm positive there are classes available for downloading data from the web. That would be a safer and cleaner bet than calling system apps like wget.

    The part after the hash symbol (the fragment) is usually handled on the client side, often by a JavaScript app, so even if you get it to work, what downloads might not be what you expected.
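    As a rough sketch of the standard-library route, the following uses urllib.request and urllib.parse. The helper names split_fragment and download are made up for illustration, and the 30 second timeout is an arbitrary choice. The fragment is split off before the request, since it is only meaningful to the client.

```python
from urllib.parse import urldefrag
from urllib.request import urlopen


def split_fragment(url):
    """Return (url_without_fragment, fragment). Hypothetical helper."""
    base, fragment = urldefrag(url)
    return base, fragment


def download(url, filename):
    """Fetch url (minus any fragment) and save the response body to filename."""
    base, _fragment = split_fragment(url)
    with urlopen(base, timeout=30) as response:
        data = response.read()
    with open(filename, "wb") as out:
        out.write(data)
    return len(data)  # number of bytes written
```

    Calling download("http://pypi.org/project/pip/#files", "index1.html") would then play the role of the wget command with -O index1.html, assuming the site is reachable.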


  • Registered Users Posts: 7,157 ✭✭✭srsly78


    OP, just use a raw string.

    rawstring = r"whatever"

    self.run_command(r"wget http://pypi.org/project/pip/\#files -O index1.html")


  • Registered Users Posts: 880 ✭✭✭clearz


    srsly78 wrote:
    OP just use a raw string.

    rawstring = r"whatever"

    self.run_command(r"wget http://pypi.org/project/pip/\#files -O index1.html")


    Won't make a difference. This is not an 'issue' with Python but with the wget application.
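    A quick way to check the raw-string point, as a sketch: "\#" is not a recognised escape sequence, so Python keeps the backslash with or without the r prefix. The eval() call is only there so the warning that newer Python versions raise for the unrecognised escape can be silenced.

```python
import warnings

# Build the non-raw literal via eval() so the "invalid escape sequence"
# warning that newer Python versions emit for "\#" can be silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    plain = eval(r'"wget http://pypi.org/project/pip/\#files -O index1.html"')

raw = r"wget http://pypi.org/project/pip/\#files -O index1.html"

# Both strings contain a backslash followed by '#', so they compare equal.
same = plain == raw
print(same)
```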

    EDIT:

    Everything related to this can be found here in the source for wget
    http://git.savannah.gnu.org/cgit/wget.git/tree/src/url.c
    To get started: anywhere the string 'fragment' appears in the above code is of interest.

    This led me to search Google for "wget fragment", which turns up plenty of relevant information.
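    For comparison only, here is how Python's own urllib.request treats the same URL. This is an analogy, not wget's code: the Request object separates the fragment out, and only the host and selector (the path) go into the HTTP request.

```python
from urllib.request import Request

# Constructing a Request parses the URL but does not touch the network.
req = Request("http://pypi.org/project/pip/#files")

# host and selector are what the HTTP request is built from.
print(req.host)
print(req.selector)
```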

