Pulling publicly available info into a web page

Digital_Guy · 01-01-2017 8:27pm #1

Hi all,

Last night I had an idea for a product / website that I know would be genuinely useful for people. Maybe it was the inspiration of a new year (!), but it's something I know people would benefit from using, and something I would find useful myself, as it would make a certain task much easier.

That to me is the key and a feature of the best / most successful products - someone finding a solution to a problem that they themselves had, or creating something born of necessity.

Anyway, the idea would involve building a site that would pull in data from various sources, to form a profile page. By way of example, let's say the profiles involved football teams. So in the case of say Arsenal, the page would feature team news and updates (e.g. pulling from Google News), Twitter updates, company financials, key players, key stats, etc.

I am sure this is relatively straightforward, but does anyone have any pointers on how to approach building even a beta version of this type of site?

Thanks in advance for any tips and insights.

Graham · 01-01-2017 8:51pm

Digital_Guy wrote: »

any tips and insights.

Consider copyright implications before you even think about architecting a solution.

Digital_Guy · 01-01-2017 9:05pm

Graham wrote: »

Consider copyright implications before you even think about architecting a solution.

Haven't gotten that far and will look into it, but I doubt there would be any implications since this would be along the lines of embedding a Twitter feed on a page as millions of websites do, pulling in Google search results, etc. I'd imagine a fair use policy would apply and that would be it really.

Buford T Justice · 02-01-2017 4:48pm

Have you any preferences on language / framework?

Digital_Guy · 02-01-2017 8:27pm

Buford T Justice wrote: »

Have you any preferences on language / framework?

Hi there, no preferences as such, and since I am not a developer I wouldn't have much knowledge of that area. My experience and expertise are on the digital marketing side. I have heard of JSON and figure that could play a part in pulling and organising the data?

Aswerty · 03-01-2017 11:39pm

Digital_Guy wrote: »

I am sure this is relatively straightforward, but does anyone have any pointers on how to approach building even a beta version of this type of site?

As Graham mentioned, considering the copyright implications is important, but on the other hand sometimes you just have to go ahead without any assurances that what you're doing is permissible. It's important to make sure you're not harming the originator's business/traffic, and even better if you actually provide value to them (e.g. refer business/traffic to them). Try to act in good faith.

You can access another site's data in one of three ways:

Via an API that the site provides. If a site does provide an API for their data, this is an indication that they think it's in their interest to disseminate their data.
Via a download the site provides. Some sites will provide certain data in a text or CSV file which you can download (in an automated way) and then pull the data from the file.
Scraping the website, where you access the site via a program/service in the same way a person accesses the site. The program then pulls certain bits of information off the site. Scraping is a gray area - some sites are dead against being scraped and other sites don't mind it if the scraping activity is minimal.
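
To give a rough idea of what the first two options look like in practice, here is a minimal Python sketch. It assumes the API returns JSON and the download is a CSV file. The sample data, field names ("team", "headlines", "player", "goals") and function names are all made up for illustration; a real site will define its own.

```python
import csv
import io
import json

# Hypothetical API response body; a real API defines its own field names.
SAMPLE_API_JSON = '{"team": "Arsenal", "headlines": ["Headline one", "Headline two"]}'

# Hypothetical CSV download; a real file would have its own columns.
SAMPLE_CSV = "player,goals\nPlayer A,12\nPlayer B,7\n"


def parse_api_response(body):
    """Turn a JSON response body into a plain dict."""
    return json.loads(body)


def parse_csv_download(text):
    """Turn CSV text into a list of row dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def build_profile(api_body, csv_text):
    """Combine both sources into one profile dict for a page template."""
    api_data = parse_api_response(api_body)
    return {
        "name": api_data["team"],
        "news": api_data["headlines"],
        "stats": parse_csv_download(csv_text),
    }


profile = build_profile(SAMPLE_API_JSON, SAMPLE_CSV)
print(profile["name"], len(profile["news"]), profile["stats"][0]["goals"])
```

In a live version the two sample strings would instead be fetched over HTTP (for example with urllib.request.urlopen from the standard library), and the resulting profile dict would be handed to whatever renders the page.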

You should probably be able to find out if the sites you're targeting provide an API or download. Chances are they won't but if they do the Gods are smiling on you. If you want to look into scraping have a look at something like Portia (to plug an Irish company) to get an idea of how you might go about scraping data.
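
For a feel of what scraping involves at its simplest, here is a small sketch using only Python's standard library. This is not how Portia works (that is a visual tool); it just shows the basic idea of pulling specific bits out of a page's HTML. The sample page and the "headline" class name are assumptions for illustration, and a real page would need its own rules.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this first.
SAMPLE_HTML = """
<html><body>
  <h2 class="headline">First story</h2>
  <p>Some text</p>
  <h2 class="headline">Second story</h2>
  <h2 class="other">Not a headline</h2>
</body></html>
"""


class HeadlineParser(HTMLParser):
    """Collect the text of every <h2 class="headline"> element."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_headline = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "h2" and ("class", "headline") in attrs:
            self._in_headline = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_headline = False

    def handle_data(self, data):
        # Only keep text that appears inside a matching heading.
        if self._in_headline:
            self.headlines[-1] += data


def extract_headlines(html):
    """Return the stripped text of each matching heading, in page order."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return [h.strip() for h in parser.headlines]


headlines = extract_headlines(SAMPLE_HTML)
print(headlines)
```

The fragile part is that the rules depend on the page's markup, so a redesign on the target site can silently break the scraper. That is one more reason an API or download is preferable where one exists.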

In terms of building a beta - I imagine the most sensible thing to do is engage with a developer. If this is a passion project you might find a developer who just wants to build it with you in their spare time (i.e. they're passionate about it as well). Chances are you'll only find someone like that among your friends, colleagues, or wider social circle, since developers typically would work on their own ideas if they wanted a side project. Oh, and for this approach you'd need to make sure there's enough upfront work for you to take on, so it doesn't look like you're trying to lump all the work onto someone else, which can often be the case when a non-developer works with a developer (i.e. someone thinks all they need to be is the "idea guy").

If you're planning on paying someone to build it, chances are you're looking at a non-trivial sum of money and you'd be best shopping around with a handful of freelancers and studios. Do your due diligence, and if you know someone who can refer a freelancer/studio that they used and are happy with, that's even better.

If you're looking at this as a potential business it's worth looking into programmes such as New Frontiers if you're new to the world of starting a business. And engage with your local enterprise office to see what they can do for you. If you are looking at paying someone to build the product for you the local enterprise offices offer some fund matching grants you might be able to avail of.

biko · 05-01-2017 1:44pm

Aswerty wrote: »

Via an API that the site provides. If a site does provide an API for their data, this is an indication that they think it's in their interest to disseminate their data.

Via a download the site provides. Some sites will provide certain data in a text or CSV file which you can download (in an automated way) and then pull the data from the file.

I was just going to say this too. If the site(s) provide an API or a download, it should be fine to link to them, as they have provided the means to do so.

Buford T Justice · 05-01-2017 7:26pm

biko wrote: »

I was just going to say this too. If the site(s) provide an API or a download, it should be fine to link to them, as they have provided the means to do so.

would also make life a bit easier too, potentially

biko · 07-01-2017 2:26pm

Indeed, data will change automatically etc

Digital_Guy · 08-01-2017 7:11pm

Thanks guys! In particular Aswerty for your detailed response. I'm familiar with the concept of an API and this idea would rely on pulling data from some of the biggest sites out there, and so there should be no issues there.

Rather it would be more about how big the project would be in terms of workload, finding a developer interested in the idea, costs and monetising the idea.

I've had some good feedback on the idea, so if you are a developer, feel free to PM.
