Just my blog

Blog about everything, mostly about tech stuff I made. Here is the list of stuff I'm using at my blog. Feel free to ask me about implementations.

Going deeper with Python, or HTMLParser and the Vkontakte randomizer come back!

Hello to anybody who reads this blog. Last time I tried to parse a saved HTML page to get Vkontakte ids and randomly select one of them each time: here and here. Now I'll try to go deeper and use a different way to extract the data from a live webpage, without saving it to the folder with the Python script. In my opinion, after some googling, I should use this:

  • http://docs.python-guide.org/en/latest/scenarios/scrape/
  • http://stackoverflow.com/questions/2081586/web-scraping-with-python
  • later I will add some more KB

The small plan:

  • Provide the URL of the page to parse:
    • in a txt file, then read it from the file in Python
    • in the console, when the Python script asks the user for it
  • Get all found ids and save the list to the file ids.csv
    • optionally with Name+id
    • or just the id, if names produce many encoding errors
  • Get one id from the list by random choice
    • optionally save it to winner.txt
    • or just show the result in the console
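
The small plan could be sketched roughly like this. It is only a sketch under assumptions, not a finished script: it uses Python 3's `html.parser` (in Python 2 the module was called `HTMLParser`), it assumes that member links look like `href="/id12345"`, and the file name `url.txt` and all function names are made up for illustration. The real Vkontakte markup may differ, and a page that requires a login will probably not load this way.

```python
import csv
import random
import re
from html.parser import HTMLParser
from urllib.request import urlopen

# Assumption: profile links look like href="/id12345"; real markup may differ.
ID_HREF = re.compile(r"^/id(\d+)$")


class IdCollector(HTMLParser):
    """Collect id -> name from links such as <a href="/id123">Name</a>."""

    def __init__(self):
        super().__init__()
        self.found = {}       # id -> name; a dict drops duplicates, keeps order
        self._current = None  # id of the link we are inside, if any
        self._text = []       # text pieces seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            match = ID_HREF.match(dict(attrs).get("href") or "")
            self._current = match.group(1) if match else None
            self._text = []
            if match:
                self.found.setdefault(self._current, "")

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            name = "".join(self._text).strip()
            # Keep the first non-empty name seen for this id.
            if name and not self.found[self._current]:
                self.found[self._current] = name
            self._current = None


def read_url(path="url.txt"):
    """Read the URL from a text file, or ask in the console if there is no file."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.readline().strip()
    except FileNotFoundError:
        return input("Page URL: ").strip()


def fetch(url):
    """Download the live page and decode it to text."""
    with urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def extract_ids(html):
    """Return a list of (id, name) pairs found in the HTML."""
    parser = IdCollector()
    parser.feed(html)
    parser.close()
    return list(parser.found.items())


def save_ids(pairs, path="ids.csv"):
    # Writing as UTF-8 avoids most encoding errors with names.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(pairs)


def pick_winner(pairs, path=None):
    """Choose one (id, name) pair at random; optionally save it to a file."""
    winner = random.choice(pairs)
    if path:
        with open(path, "w", encoding="utf-8") as f:
            f.write(",".join(winner) + "\n")
    return winner


def main():
    pairs = extract_ids(fetch(read_url()))
    save_ids(pairs)
    print(pick_winner(pairs, "winner.txt"))
```

Calling `main()` runs the whole chain: URL, download, ids.csv, winner. To skip the names and keep only ids, it should be enough to write `[(i,) for i, _ in pairs]` to the CSV instead.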

Some inconveniences of this approach:

  • The user may need to install an additional library.
  • Not user-friendly
  • Needs an additional how-to for the user

Advantages:

  • It just works
  • Almost nothing can break
  • The easiest way
  • Less time to implement

The BIG PLAN:

  • Include all bullets from the small plan
  • Deploy this on a CentOS web server
  • Create a web page with fields for the URL, options, etc.
  • Create a web page with the results: window, field, table, etc.
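
To get a feeling for the web part, here is a minimal sketch using only the standard library (`wsgiref`), with one page that shows both the URL field and the result. Everything in it is an assumption for illustration: the names `make_app`, `serve` and `draw` are made up, and `draw` stands for whatever function takes a URL and returns the winner. On a real CentOS server this would more likely run behind Apache or nginx than on the built-in development server.

```python
from html import escape
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server

# The input page: a single field for the URL (options could be added here).
FORM = """<form method="get" action="/">
  <label>Page URL: <input name="url" size="60"></label>
  <button>Pick a winner</button>
</form>"""


def make_app(draw):
    """Build a WSGI app; `draw` is any callable that maps a URL to a winner."""

    def app(environ, start_response):
        query = parse_qs(environ.get("QUERY_STRING", ""))
        url = query.get("url", [""])[0].strip()
        body = FORM
        if url:
            try:
                result = "Winner: " + escape(str(draw(url)))
            except Exception as exc:
                # Show the error on the page instead of a blank response.
                result = "Error: " + escape(str(exc))
            body += "<p>" + result + "</p>"
        page = ("<!doctype html><meta charset='utf-8'>"
                "<title>Randomizer</title>" + body)
        data = page.encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(data))),
        ])
        return [data]

    return app


def serve(draw, port=8000):
    # Development server only, good enough to try the page locally.
    with make_server("", port, make_app(draw)) as httpd:
        httpd.serve_forever()
```

Running `serve(some_function)` and opening `http://localhost:8000/` should show the form; after submitting a URL the same page shows the winner below it. All output is escaped, so a strange name or URL cannot break the page.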

Some inconveniences of this approach:

  • I need to gain HTML experience
  • I need to gain design experience
  • MySQL is probably needed
  • A logging mechanism is probably needed
  • May become bug-friendly
  • More time to implement
  • Needs a lot of experience beyond my current level
  • MUCH more difficult to start with this plan

Advantages:

  • User-friendly
  • Does not need long instructions on how to work with it
  • Nothing to install
  • Faster, better, smoother
  • Gives me a lot of experience