Monday, August 15, 2016

9 Must-Have Skills to Be a Data Scientist

1. Programming


Python - Learning Python is fun. http://pythonprogramminglanguage.com/

IPython Notebook - http://ipython.org/notebook.html


2. Probability and Statistics 

The connection between Data Mining and Statistics is beautifully explained by Jerome H. Friedman. http://statweb.stanford.edu/~jhf/ftp/dm-stat.pdf

R - A language for statistical computing and plotting. https://www.r-project.org/

SymPy - http://www.sympygamma.com

Statsmodels - http://statsmodels.sourceforge.net/

Statistics Interactive Course - http://onlinestatbook.com/index.html


3. Machine Learning Algorithms & Libraries


Scikit Learn - http://scikit-learn.org/stable/

Scipy - http://scipy.org/scipylib/index.html

Scikit-Image - A collection of algorithms for image processing in Python

PyBrain - http://pybrain.org


4. Data Structures


Csvkit - https://csvkit.readthedocs.org

Pandas - http://pandas.pydata.org

NumPy - http://www.numpy.org

Theano - http://deeplearning.net/software/theano/library/tensor/basic.html


5. Information Extraction


Scrapy - http://scrapy.org/

BeautifulSoup - https://pypi.python.org/pypi/beautifulsoup4


6. Natural Language Processing


NLTK - http://www.nltk.org/

Pattern - https://pypi.python.org/pypi/Pattern


7. Databases



Wide Column Store/Column Families

HBase
Cassandra
Hypertable

Key-Value/Tuple Store

Couchbase Server
Voldemort
Cassandra
MemcacheDB
Amazon DynamoDB

Document Store

MongoDB
CouchDB

Graph Database

InfoGrid

8. Big Data & Distributed Computing


Amazon's EC2 - https://aws.amazon.com/ec2/

Map Reduce - https://en.wikipedia.org/wiki/MapReduce

Apache Hadoop - A software framework for distributed computing. http://hadoop.apache.org/

Apache Mahout - http://mahout.apache.org/

Apache Spark - http://spark.apache.org/

Apache Whirr - https://whirr.apache.org/

Hive - https://hive.apache.org/

Zookeeper - https://zookeeper.apache.org/


9. Visualization


Matplotlib - http://matplotlib.org

Bokeh - http://bokeh.pydata.org/en/latest/

D3.js - https://d3js.org/


