
Skills for Data Mining/Analysis

  • 01-09-2015 1:35pm
    #1
    Registered Users Posts: 8,198 ✭✭✭funkey_monkey


    Hi,

    I'm currently being made redundant and am looking for opportunities outside of my current sector, which is a niche area of development.

    I was looking at the possibility of switching into data mining / analysis for some of the big finance houses (or similar).

    Does anyone here work in this sector and/or can you tell me what skills (soft skills and languages/toolsets) are required to work in this area?


    Thanks.


Comments

  • Registered Users Posts: 112 ✭✭JigglyMcJabs


    Before you jump into toolsets and languages, I would suggest focussing on statistics. You need to have a solid base in stats first; then you could look at R and Python.

    There are a few good free courses in data analytics at the likes of NCI, DIT and DBS that you could look at too.


  • Registered Users Posts: 7,410 ✭✭✭jmcc


    I don't deal with Financial data but rather with Internet data. (Domain name transactions covering about 720 top level domains and measuring website usage in TLDs and Global IP address mapping.)

    Mathematics.
    If you have an Arts background with little Mathematics, you are going to be in trouble. Apart from the Statistics suggested earlier, you will also need a working knowledge of Numerical Computations and Parallelisation.

    Thinking.
    This is not the Philosophy wánkathon stuff but something far more important. You have to be able to think in terms of data and computations. You have to be able to specify a problem, define the data you need to solve it, define the calculations necessary to solve the problem and, most importantly, know when you have solved it or made a mistake. Some Big Data problems have highly counter-intuitive solutions so when some academic with no combat experience suggests the usual textbook approach, you have to understand the data and your software well enough to know that they are talking bullsh!t. The textbook for what you are attempting may not have been written yet.

    Persistence.
    Big Data is big. It can take time to create a solution and crunch the data. And that doesn't even get into the whole ETL (Extract, Transform, Load) part of cleaning data. You have to have a long attention span (the longer the better) because problem solving at this level is not like building a toy website with a mickey mouse 100KB database.
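
    To make the ETL (Extract, Transform, Load) step above a bit more concrete, here is a minimal sketch in Python using only the standard library. The sample data, the `site_visits` table name and the function names are all hypothetical, chosen purely for illustration; a real pipeline would read from files or feeds and would need far more careful error handling.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file or a feed.
RAW_CSV = """domain,visits
example.com, 120
EXAMPLE.org,n/a
 test.net ,45
"""


def extract(text):
    """Extract: read the raw rows out of CSV text as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise domain names, convert counts, drop unusable rows."""
    clean = []
    for row in rows:
        domain = row["domain"].strip().lower()
        try:
            visits = int(row["visits"].strip())
        except ValueError:
            continue  # the count is not a number, so skip this row
        clean.append((domain, visits))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS site_visits (domain TEXT, visits INTEGER)"
    )
    conn.executemany("INSERT INTO site_visits VALUES (?, ?)", rows)
    conn.commit()


def run_etl(text=RAW_CSV):
    """Run all three steps against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn
```

    Calling `run_etl()` returns a connection you can query with ordinary SQL. The row with `n/a` as its count is dropped during the transform step, which is the kind of decision you have to make deliberately when cleaning real data.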

    Know Your Tools.
    Remember the movie "Ronin" where the spoofer asked the professional which was his favourite gun? The professional responds that it's a toolbox and he uses the right weapon for the job. Well, you have to have a working knowledge of most of the major tools used and know how they work. You have to know database software, hardware (very important when it comes to calculations) and analysis software. You also have to be capable of writing your own software and tools. (Unless you have clean data, you are going to be cleaning the data, so learn about the REGEX of your favourite parsing language.)
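
    As a rough illustration of the regex point, here is a small Python sketch that pulls domain names out of messy text lines and counts them per TLD. The pattern and the function names are hypothetical and deliberately simplistic: the pattern will happily match things like file names ("index.html"), so treat it as a starting point rather than a proper domain validator.

```python
import re
from collections import Counter

# Hypothetical pattern: one or more dot-terminated labels followed by an
# alphabetic TLD of at least two letters. Not a full hostname validator.
DOMAIN_RE = re.compile(
    r"\b((?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,})\b",
    re.IGNORECASE,
)


def extract_domains(line):
    """Return every domain-like string found in a line, lower-cased."""
    return [match.group(1).lower() for match in DOMAIN_RE.finditer(line)]


def tld_counts(lines):
    """Count how many extracted domains fall under each TLD."""
    counts = Counter()
    for line in lines:
        for domain in extract_domains(line):
            counts[domain.rsplit(".", 1)[1]] += 1
    return counts
```

    For example, `extract_domains("<b>Test-Site.org</b>")` picks out the domain and ignores the surrounding markup, while a line with no dotted name in it yields an empty list.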


    You can also ask this question in the Big Data forum:
    http://www.boards.ie/vbulletin/forumdisplay.php?f=1630

    Regards...jmcc


  • Closed Accounts Posts: 22,649 ✭✭✭✭beauf


    If you have the skill-set, is it possible to get into it without any experience of the finance sector? I would have assumed business knowledge would be a requirement.


  • Registered Users Posts: 8,198 ✭✭✭funkey_monkey


    What about the toolsets - are they difficult to pick up?


  • Posts: 0 ✭✭✭✭ Giuliana Gorgeous Sludge


    There are many different toolsets for many different jobs / tasks. I can speak only for myself, but I'm not aware of any single market-leading technology; instead we use bits of X, Y and Z everywhere.

    Tools / languages / databases that might be of interest

    Python - (SciPy stack includes an IDE and relevant packages)
    R - CRAN has ****tonnes of packages
    C#
    F#
    IronPython
    q

    SQL
    MongoDB
    Riak
    kdb+

    Any familiarity with these that you might already have will probably guide you in what you should pick up first to get a step on the ladder etc.

    There are some excellent courses on Udemy / Coursera / NewBoston, which are all free, that can take you around some of the packages and get you up to speed quicker.

    https://www.kaggle.com/competitions is worth a glance to see what you might be expected to be able to work through. As far as I remember, you can use some kernels on the site without needing to install anything yourself.


  • Registered Users Posts: 1,017 ✭✭✭whatever76


    This is very Microsoft-centric, but it does give good basic concepts of Machine Learning techniques and some good labs as well: https://studio.azureml.net/

    JMCC above summed it up perfectly - statistics is key to get you started!

