Chapter 1
| Data Scientist: The Sexiest Job of the 21st Century | https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century |
| Top Trends in the Gartner Hype Cycle for Emerging Technologies 2017 | http://www.gartner.com/smarterwithgartner/top-trends-in-the-gartner-hype-cycle-for-emerging-technologies-2017 |
Chapter 2
Roles
| The Data Science Venn Diagram | http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram |
Online Courses and Resources
| Online courses: Coursera | https://www.coursera.org/ |
| Online courses: edX | https://www.edx.org/ |
| Online courses and many books published by O’Reilly | https://www.safaribooksonline.com/ |
| Online courses: Udacity | |
| Online courses: Udemy | https://www.udemy.com/ |
Further Information
| SystemML (now SystemDS) | https://systemds.apache.org/ |
| DevOps, und alle ziehen an einem Strang | https://www.cloudcomputing-insider.de/devops-und-alle-ziehen-an-einem-strang-a-501139/ |
| DevOps: Schluss mit den Grenzen zwischen Entwicklung und Operations | https://de.atlassian.com/devops |
Chapter 3
Challenges
| Weshalb die meisten Big-Data-Projekte scheitern | https://www.datacenter-insider.de/weshalb-die-meisten-big-data-projekte-scheitern-a-417085/ |
| Woran Big-Data-Analysen wirklich scheitern | https://www.bigdata-insider.de/woran-big-data-analysen-wirklich-scheitern-a-677594/ |
| Berater scheitern an Data Analytics | https://www.cio.de/a/berater-scheitern-an-data-analytics,3580190 |
| Why Silicon Valley’s ‘Fail Fast’ Mantra Is Just Hype | https://www.forbes.com/sites/robasghar/2014/07/14/why-silicon-valleys-fail-fast-mantra-is-just-hype/#170c1a6d24bc |
CRISP-DM
| The CRISP-DM User Guide | https://s2.smu.edu/~mhd/8331f03/crisp.pdf |
| Why Continuous Learning is the key towards Machine Intelligence | https://medium.com/@vlomonaco/why-continuous-learning-is-the-key-towards-machine-intelligence-1851cb57c308 |
SCRUM, KANBAN, Machine Learning Canvas
| Scrum-Einführung | http://scrum-master.de/Scrum-Einfuehrung |
| Scrum – Ein kurzer Blick auf die Verwendung des Scrum-Frameworks in der Softwareentwicklung | https://de.atlassian.com/agile/scrum |
Cloud
| Data Lakes and Analytics on AWS | https://aws.amazon.com/de/products/analytics/ |
| Azure Analysis Services | https://azure.microsoft.com/en-us/services/analysis-services/ |
| Azure Big data and analytics | https://azure.microsoft.com/en-us/solutions/big-data/ |
| IBM Analytics Services | https://www.ibm.com/cloud/analytics |
| Data and analytics services on IBM cloud | https://www.ibm.com/cloud/data |
| IBM Watson Studio | https://www.ibm.com/de-de/cloud/watson-studio |
| Comparing Machine Learning as a Service: Amazon, Microsoft Azure, Google Cloud AI | https://www.kdnuggets.com/2018/01/mlaas-amazon-microsoft-azure-google-cloud-ai.html |
| Simple Monthly Calculator | http://calculator.s3.amazonaws.com/index.html |
Further Information
| Was ist Business Intelligence – BI? | https://www.bigdata-insider.de/was-ist-business-intelligence-bi-a-563185/ |
| Data Science Plattform Kaggle | https://www.kaggle.com/ |
Chapter 4
| The 10 Vs of Big Data | https://tdwi.org/articles/2017/02/08/10-vs-of-big-data.aspx |
| Smart Data Newsletter | https://www.digitale-technologien.de/DT/Redaktion/DE/Downloads/Publikation/SmartData_NL1.pdf?__blob=publicationFile&v=5 |
| Aktuelles Schlagwort “Semi-strukturierte Daten” | https://www.en.pms.ifi.lmu.de/publications/PMS-FB/PMS-FB-2001-9.pdf |
| The world’s most valuable resource is no longer oil, but data | https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data |
| Is Your Company’s Data Actually Valuable in the AI Era? | https://hbr.org/2018/01/is-your-companys-data-actually-valuable-in-the-ai-era |
Chapter 5
Methods
| Regular Expressions 101 | https://regex101.com/ |
| RegExr | https://regexr.com/ |
| Python Data Preparation Case | https://www.kdnuggets.com/2017/09/python-data-preparation-case-files-group-based-imputation.html |
| Principal Component Analysis | http://setosa.io/ev/principal-component-analysis/ |
| A One-Stop Shop for Principal Component Analysis | https://towardsdatascience.com/a-one-stop-shop-for-principal-component-analysis-5582fb7e0a9c |
| Tankerkönig | https://creativecommons.tankerkoenig.de/ |
| StatsModels – Statistics in Python | http://www.statsmodels.org/ |
| Forecasting: Principles and Practice | https://www.otexts.org/fpp/ |
| Coursera – Deep Learning Specialization | https://www.coursera.org/specializations/deep-learning |
| Linear Regression: Implementation, Hyperparameters and their Optimizations | http://pavelbazin.com/post/linear-regression-hyperparameters/#linear-regression-implementation-hyperparameters-and-their-optimizations |
| scikit-learn – Logistic Regression | http://scikit-learn.org/stable/modules/linear_model.html#logistic-regression |
| What are kernels in machine learning and SVM and why do we need them? | https://www.quora.com/What-are-kernels-in-machine-learning-and-SVM-and-why-do-we-need-them |
| Mahalanobis-Distanz | http://www.statistics4u.com/fundstat_germ/ee_mahalanobis_distance.html |
| Distance and Similarity Measures Effect on the Performance of K-Nearest Neighbor Classifier – A Review | https://arxiv.org/pdf/1708.04321.pdf |
| The distance function effect on k-nearest neighbor classification for medical datasets | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978658/ |
| Introduction to k-Nearest Neighbors: Simplified (with implementation in Python) | https://www.analyticsvidhya.com/blog/2018/03/introduction-k-neighbours-algorithm-clustering/ |
| 6 Easy Steps to Learn Naive Bayes Algorithm (with codes in Python and R) | https://www.analyticsvidhya.com/blog/2017/09/naive-bayes-explained/ |
| Python for Image Understanding: Deep Learning with Convolutional Neural Nets | https://www.slideshare.net/roelofp/python-for-image-understanding-deep-learning-with-convolutional-neural-nets |
| 10 Ways Machine Learning Is Revolutionizing Manufacturing In 2018 | https://www.forbes.com/sites/louiscolumbus/2018/03/11/10-ways-machine-learning-is-revolutionizing-manufacturing-in-2018/#2a267fa023ac |
| Machine Learning in Manufacturing – Present and Future Use-Cases | https://emerj.com/ai-sector-overviews/machine-learning-in-manufacturing/ |
| The Neural Network Zoo | http://www.asimovinstitute.org/neural-network-zoo/ |
| Top 8 Free Must-Read Books on Deep Learning | https://www.kdnuggets.com/2018/04/top-free-books-deep-learning.html |
| Deep Learning – An MIT Press book | http://www.deeplearningbook.org |
| Neural Networks and Deep Learning | http://neuralnetworksanddeeplearning.com/ |
| Neural Networks and Learning Machines (Third Edition) | https://cours.etsmtl.ca/sys843/REFS/Books/ebook_Haykin09.pdf |
| Selecting the number of clusters with silhouette analysis on KMeans clustering | https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html |
Databases and ETL Tools
| w3schools.com – SQL Tutorial | https://www.w3schools.com/sql/ |
| List of NoSQL Databases | http://nosql-database.org/ |
| The history of Hadoop | https://medium.com/@markobonaci/the-history-of-hadoop-68984a11704 |
| Hadoop Tutorial: All you need to know about Hadoop! | https://www.edureka.co/blog/hadoop-tutorial/ |
| Download Cloudera Enterprise | https://www.cloudera.com/downloads.html |
| Downloads for Hortonworks connected data platforms | |
| Safari Books Online | https://www.safaribooksonline.com/ |
| Talend | https://de.talend.com/ |
| Informatica | https://www.informatica.com/de |
| Apache Kafka | http://kafka.apache.org/ |
| Confluent | https://www.confluent.io/ |
| Nifi | https://nifi.apache.org/ |
| Hortonworks Data Platform | Note: Hortonworks has since become part of Cloudera |
| Apache Spark Streaming | https://spark.apache.org/streaming/ |
Analytics Tools
Further Reading
| KDNuggets | https://www.kdnuggets.com/ |
| Data Science Central | https://www.datasciencecentral.com/ |
| Towards Data Science | https://towardsdatascience.com/ |
Chapter 6
Process Mining
| Interview – Process Mining ist ein wichtiger Treiber der Prozessautomatisierung | https://data-science-blog.com/blog/2017/10/19/interview-prof-scheer-process-mining-automation/ |
| Celonis | https://www.celonis.com/de/ |
| Fluxicon | https://fluxicon.com/disco/ |
| Process Mining | |
| Dataset – Production Analysis with Process Mining Technology | https://data.4tu.nl/articles/dataset/Production_Analysis_with_Process_Mining_Technology/12697997 |
| ProM Tools | |
| Online Course: Introduction to Process Mining with ProM | https://www.futurelearn.com/courses/process-mining |
| Alpha Miner | https://www.futurelearn.com/courses/process-mining/0/steps/15637 |
Reports
Maintenance
| automotiveIT: Predictive Maintenance enttäuscht Erwartungen | https://www.automotiveit.eu/predictive-maintenance-enttaeuscht-erwartungen/news/id-0060652 |
| 6. Turbofan Engine Degradation Simulation Data Set | |
| Getting Started with Predictive Maintenance Models | https://www.svds.com/getting-started-predictive-maintenance-models/ |
| Predictive Maintenance for IoT | https://www.svds.com/predictive-maintenance-iot/ |
| Data analysis and processing techniques for remaining useful life estimations | https://rdw.rowan.edu/cgi/viewcontent.cgi?article=3433&context=etd |
| GitHub – Apache Spark – Turbofan Engine Degradation Simulation Data Set example in Apache Spark | https://github.com/oluies/tedsds |
Transport
| Bureau Of Transportation Statistics: Freight Analysis Framework | https://www.bts.gov/faf |
| Seaborn: statistical data visualization | https://seaborn.pydata.org/ |
| Geopy | https://geopy.readthedocs.io/en/stable/ |
| Great Circle Maps for Python | https://github.com/paulgb/gcmap |
Last checked: 21.09.2025