Here you will find the links to all online resources that are referenced or otherwise mentioned in the book.
As of: 19 July 2019
Chapter 1
Data Scientist: The Sexiest Job of the 21st Century | https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century |
Top Trends in the Gartner Hype Cycle for Emerging Technologies 2017 | http://www.gartner.com/smarterwithgartner/top-trends-in-the-gartner-hype-cycle-for-emerging-technologies-2017/ |
Chapter 2
Roles
Close look at Data Scientist vs Data Engineer | http://www.techiexpert.com/close-look-data-scientist-vs-data-engineer/ |
The Data Science Venn Diagram | http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram |
How Do I Become a Data Scientist? | https://advanceddataanalytics.net/2015/05/12/how-do-i-become-a-data-scientist/ |
The New Data Scientist Venn Diagram | https://whatsthebigdata.com/2016/07/08/the-new-data-scientist-venn-diagram/ |
Online courses and resources
Online courses | https://www.coursera.org/ |
Online courses | https://www.edx.org/ |
Online courses and many books published by O'Reilly | https://www.safaribooksonline.com/ |
Online courses | https://eu.udacity.com/ |
Online courses | https://www.udemy.com/ |
Further information
SystemML (now SystemDS) | https://systemds.apache.org/ |
DevOps, und alle ziehen an einem Strang | https://www.cloudcomputing-insider.de/devops-und-alle-ziehen-an-einem-strang-a-501139/ |
DevOps: Schluss mit den Grenzen zwischen Entwicklung und Operations | https://de.atlassian.com/devops |
Chapter 3
Challenges
Weshalb die meisten Big-Data-Projekte scheitern | https://www.datacenter-insider.de/weshalb-die-meisten-big-data-projekte-scheitern-a-417085/ |
Woran Big-Data-Analysen wirklich scheitern | https://www.bigdata-insider.de/woran-big-data-analysen-wirklich-scheitern-a-677594/ |
Berater scheitern an Data Analytics | https://www.cio.de/a/berater-scheitern-an-data-analytics,3580190 |
BARC-Studie: Data-Preparation-Initiativen scheitern oft an Fachkräftemangel und fehlender Management-Unterstützung | https://barc.de/news/barc-studie-data-preparation-initiativen-scheitern-oft-an-fachkraftemangel-und-fehlender-management-unterstutzung |
Why Silicon Valley's 'Fail Fast' Mantra Is Just Hype | https://www.forbes.com/sites/robasghar/2014/07/14/why-silicon-valleys-fail-fast-mantra-is-just-hype/#170c1a6d24bc |
CRISP-DM
KDD, SEMMA and CRISP-DM: A parallel overview | http://recipp.ipp.pt/bitstream/10400.22/136/3/KDD-CRISP-SEMMA.pdf |
CRISP-DM 1.0 - Step-by-step data mining guide | https://www.the-modeling-agency.com/crisp-dm.pdf |
The CRISP-DM User Guide | https://s2.smu.edu/~mhd/8331f03/crisp.pdf |
Why Continuous Learning is the key towards Machine Intelligence | https://medium.com/@vlomonaco/why-continuous-learning-is-the-key-towards-machine-intelligence-1851cb57c308 |
SCRUM, KANBAN, Machine Learning Canvas
Scrum-Einführung | http://scrum-master.de/Scrum-Einfuehrung |
Scrum - Ein kurzer Blick auf die Verwendung des Scrum-Frameworks in der Softwareentwicklung | https://de.atlassian.com/agile/scrum |
Sind wir schon da? – Die Definition of Done (DOD) | https://www.scrum.de/sind-wir-schon-da-die-definition-of-done-dod/ |
KANBAN Board Simulation | http://www.kanbansim.org/ |
Cloud
Data Lakes and Analytics on AWS | https://aws.amazon.com/de/products/analytics/ |
Azure Analysis Services | https://azure.microsoft.com/en-us/services/analysis-services/ |
Azure Big data and analytics | https://azure.microsoft.com/en-us/solutions/big-data/ |
IBM Analytics Services | https://www.ibm.com/cloud/analytics |
IBM Analytics Engine | https://www.ibm.com/cloud/analytics-engine |
Data and analytics services on IBM cloud | https://www.ibm.com/cloud/data |
IBM Watson Studio (updated) | https://www.ibm.com/de-de/cloud/watson-studio |
Comparing Machine Learning as a Service: Amazon, Microsoft Azure, Google Cloud AI | https://www.kdnuggets.com/2018/01/mlaas-amazon-microsoft-azure-google-cloud-ai.html |
Simple Monthly Calculator | http://calculator.s3.amazonaws.com/index.html |
Further information
Was ist Business Intelligence – BI? | https://www.bigdata-insider.de/was-ist-business-intelligence-bi-a-563185/ |
Data science platform Kaggle | https://www.kaggle.com/ |
Chapter 4
Die 9 V von Big Data | https://blog.qsc.de/2016/08/die-9-v-von-big-data/ |
13 V’s in Big Data (dead link; alternative: TDWI, The 10 Vs of Big Data) | http://www.godatafy.com/tag/13-vs-in-big-data/ |
Was ist Big Data? – Eine Definition mit fünf V | https://blog.unbelievable-machine.com/was-ist-big-data-definition-f%C3%BCnf-v |
Smart Data Newsletter | https://www.digitale-technologien.de/DT/Redaktion/DE/Downloads/Publikation/SmartData_NL1.pdf?__blob=publicationFile&v=5 |
Attacking Machine Learning with Adversarial Examples | https://blog.openai.com/adversarial-example-research/ |
Aktuelles Schlagwort “Semi-strukturierte Daten” | https://www.en.pms.ifi.lmu.de/publications/PMS-FB/PMS-FB-2001-9.pdf |
The world’s most valuable resource is no longer oil, but data | https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data |
Is Your Company’s Data Actually Valuable in the AI Era? | https://hbr.org/2018/01/is-your-companys-data-actually-valuable-in-the-ai-era |
Chapter 5
Methods
Databases and ETL tools
Analytics tools
Further information
KDNuggets | https://www.kdnuggets.com/ |
Data Science Central | https://www.datasciencecentral.com/ |
Towards Data Science | https://towardsdatascience.com/ |
Chapter 6
Process Mining
Interview – Process Mining ist ein wichtiger Treiber der Prozessautomatisierung | https://data-science-blog.com/blog/2017/10/19/interview-prof-scheer-process-mining-automation/ |
Celonis | https://www.celonis.com/de/ |
Fluxicon | https://fluxicon.com/disco/ |
Process Mining | http://www.processmining.org/tools/start |
Dataset - Production Analysis with Process Mining Technology (updated) | https://data.4tu.nl/articles/dataset/Production_Analysis_with_Process_Mining_Technology/12697997 |
ProM Tools | http://www.promtools.org/doku.php |
Online Course: Introduction to Process Mining with ProM | https://www.futurelearn.com/courses/process-mining |
Alpha Miner | https://www.futurelearn.com/courses/process-mining/0/steps/15637 |
Fuzzy Miner (updated) | http://processmining.org/online/fuzzyminer |
Reports
Data.gov - Consumer Complaint Database | https://catalog.data.gov/dataset/consumer-complaint-database |
German Stopwords | https://github.com/solariz/german_stopwords/blob/master/german_stopwords_full.txt |
nltk.stem package | http://www.nltk.org/api/nltk.stem.html |
German stemming algorithm | http://snowball.tartarus.org/algorithms/german/stemmer.html |
Stemming and Lemmatization with Python NLTK | http://text-processing.com/demo/stem/ |
corpus: Text Corpus Analysis | https://cran.r-project.org/web/packages/corpus/ |
scikit-learn - Logistic Regression | http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html |
scikit-learn - Working with text data | https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html |
Machine Learning, NLP: Text Classification using scikit-learn, python and NLTK. | https://towardsdatascience.com/machine-learning-nlp-text-classification-using-scikit-learn-python-and-nltk-c52b92a7c73a |
Maintenance
Transport
Bureau Of Transportation Statistics: Freight Analysis Framework | https://www.bts.gov/faf |
Seaborn: statistical data visualization | https://seaborn.pydata.org/ |
Geopy | https://geopy.readthedocs.io/en/stable/ |
Great Circle Maps for Python | https://github.com/paulgb/gcmap |