# Category Archives: Data Science

## QA – Testing Features of Machine Learning Models

In this post, you will learn about the different types of test cases you could come up with for testing the features of data science/machine learning models. Testing features is one of the key QA tasks that need to be performed to ensure that machine learning models perform well in a consistent and sustained manner. Features are the most important part of a machine learning model. A feature is simply a predictor variable used to predict the outcome or response variable. Simply speaking, the following function represents $y$ as the outcome variable and $x_1$, $x_2$ and $x_1 x_2$ as predictor variables: $y = a_1 x_1 + a_2 x_2 + a_3 x_1 x_2 + e$. In the above function, …
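To make the role of the features concrete, here is a minimal Python sketch of the function in this excerpt. The helper names `build_features` and `predict` and the coefficient values are illustrative assumptions, not taken from the post, and the error term e is left out because it is not observed at prediction time.

```python
def build_features(x1, x2):
    """Return the three predictor values: x1, x2 and the interaction term x1*x2."""
    return [x1, x2, x1 * x2]


def predict(coefs, x1, x2):
    """Compute a1*x1 + a2*x2 + a3*x1*x2 (the error term e is omitted)."""
    return sum(a * f for a, f in zip(coefs, build_features(x1, x2)))


# Hypothetical coefficients a1, a2, a3, chosen only for illustration.
coefs = [2.0, -1.0, 0.5]
y_hat = predict(coefs, x1=3.0, x2=4.0)  # 2*3 - 1*4 + 0.5*(3*4)
print(y_hat)
```

A feature-level test case could then assert that each derived feature, such as the interaction term, is computed as intended before the model ever sees it.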

## QA of Machine Learning Models with PDCA Cycle

The primary goal of establishing and implementing Quality Assurance (QA) practices for machine learning/data science projects, or for projects that use machine learning models, is to achieve consistent and sustained improvements in the business processes that make use of the underlying ML predictions. This is where the PDCA (Plan-Do-Check-Act) cycle is applied to establish a repeatable process which ensures that high-quality machine learning (ML) based solutions are served to clients in a consistent and sustained manner. The following diagram represents the details, which are also listed after it. Plan. Explore/describe the business problems: In this stage, product managers/business analysts sit with data scientists and discuss the business problem at hand. The outcome of this …

## QA & Data Science – How to Test Features Relevance

In this post, I present a perspective on why the QA/testing team needs to test feature relevance when testing machine learning models as part of data science QA initiatives, along with different techniques that could be used to test, or perform QA on, feature relevance. Feature relevance can also be termed feature importance. Simply speaking, a feature is said to be relevant or important if it adds real predictive value to the underlying model. Relevant features must display a stable statistical relationship or association with the outcome variable. An association does not imply causation. However, a relevant feature or a feature …
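One simple way to probe the "stable statistical relationship" mentioned in this excerpt is to compute the correlation between a feature and the outcome on several slices of the data and check that it keeps the same sign and a reasonable strength. The sketch below is only one possible check and is not necessarily among the techniques the post goes on to describe; the function names, the toy data and the `min_abs=0.3` threshold are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_by_chunk(xs, ys, n_chunks):
    """Correlation of feature and outcome on consecutive, equally sized chunks."""
    size = len(xs) // n_chunks
    return [
        pearson(xs[i * size:(i + 1) * size], ys[i * size:(i + 1) * size])
        for i in range(n_chunks)
    ]


def is_stable(correlations, min_abs=0.3):
    """Stable here means: same sign in every chunk and never weaker than min_abs."""
    same_sign = all(c > 0 for c in correlations) or all(c < 0 for c in correlations)
    return same_sign and all(abs(c) >= min_abs for c in correlations)


# Made-up data: the outcome follows `relevant` plus a small alternating wiggle.
relevant = list(range(12))
outcome = [2 * x + (-1) ** i for i, x in enumerate(relevant)]
unrelated = [1, -1, -1, 1] * 3  # hypothetical feature with no link to the outcome

print(is_stable(correlation_by_chunk(relevant, outcome, 3)))
print(is_stable(correlation_by_chunk(unrelated, outcome, 3)))
```

Correlation only captures linear association, so a QA check like this would flag candidates for a closer look instead of proving or disproving relevance.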

## Quality Assurance / Testing the Machine Learning Model

This is the first post in a series on Quality Assurance & Testing Practices for Data Science / Machine Learning Models, which I will release over the next few months. The goal of this and upcoming posts is to create a tool and framework that could help you design your testing/QA practices around data science/machine learning models. Why QA Practices for testing Machine Learning Models? Are you a test engineer who wants to know how you could make a difference in the AI initiative being undertaken by your current company? Are you a QA manager looking for, or researching, tools and frameworks which could help your team perform QA with …

## Difference between Frequentist vs Bayesian Probability

In this post, you will learn about the difference between Frequentist and Bayesian probability. It is of the utmost importance to understand these concepts if you are getting started with Data Science. What is Frequentist Probability? The probability of an event, when calculated as a function of the frequency with which events of that type occur, is called Frequentist probability. For example, the probability of rolling a die (with faces numbered 1 to 6) and getting a 3 can be treated as a Frequentist probability. Consider another example: a head occurring as a result of tossing a coin. Note that the Frequentist frequencies can be calculated by conducting the experiment in …
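The Frequentist idea of probability as a long-run relative frequency can be sketched with a small simulation. The function names, the seed and the number of trials below are assumptions chosen for illustration; the simulated die and coin are assumed to be fair.

```python
import random


def roll_die(rng):
    """One roll of a fair six-sided die."""
    return rng.randint(1, 6)


def toss_coin(rng):
    """One toss of a fair coin."""
    return rng.choice(["head", "tail"])


def relative_frequency(trial, event, n_trials, seed=0):
    """Repeat `trial` n_trials times and return the share of outcomes where `event` holds."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if event(trial(rng)))
    return hits / n_trials


p_three = relative_frequency(roll_die, lambda face: face == 3, n_trials=100_000)
p_head = relative_frequency(toss_coin, lambda side: side == "head", n_trials=100_000)
print(p_three, p_head)  # estimates should sit near 1/6 and 1/2
```

Increasing `n_trials` tightens the estimates around 1/6 and 1/2, which is the repeated-experiment view of probability the excerpt describes.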

## AI – Three Different types of Machine Learning Algorithms

This post aims to help you learn the different types of machine learning algorithms which form the key to artificial intelligence (AI):

- Machine learning algorithms
- Representation or feature learning algorithms
- Deep learning algorithms

The following represents the different types of learning algorithms in the form of a Venn diagram. What are Machine Learning (ML) Algorithms? Machine learning algorithms are the simplest class of algorithms when talking about AI. ML algorithms are based on the idea that external entities such as business analysts and data scientists need to work together to identify the feature set for building the model. The ML algorithms are then trained to come up with coefficients for each of the features and how are they …

## 8 Machine Learning Javascript Frameworks to Explore

JavaScript developers tend to look out for JavaScript frameworks which can be used to train machine learning models based on different machine learning algorithms. The following are some of the machine learning algorithms for which models can be trained using the JavaScript frameworks listed in this article:

- Simple linear regression
- Multivariate linear regression
- Logistic regression
- Naive Bayes
- K-nearest neighbour (KNN)
- K-means
- Support vector machine (SVM)
- Random forest
- Decision tree
- Feedforward neural network
- Deep learning network

In this post, you will learn about different JavaScript frameworks for machine learning. Some of them are the following:

- Deeplearn.js
- Propel
- ConvNetJS
- ML-JS
- KerasJS
- STDLib
- Limdu.js
- Brain.js

DeepLearn.js

Deeplearn.js is an open-source machine learning JavaScript library …

## Machine Learning – Validation Techniques (Interview Questions)

Validation techniques in machine learning are used to estimate an error rate for the ML model that can be considered close to the true error rate of the population. If the data volume is large enough to be representative of the population, you may not need validation techniques. However, in real-world scenarios, we work with samples of data which may not be truly representative of the population. This is where validation techniques come into the picture. In this post, you will briefly learn about different validation techniques such as the following, and you will also be presented with a practice test with questions and answers which could be used …
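As an illustration of how a validation technique estimates the error rate from a limited sample, here is a minimal sketch of k-fold cross-validation. The excerpt's own list of techniques is cut off, so the choice of k-fold is an assumption; the function names, the toy labels and the trivial majority-class "model" are also made up for the example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample lands in exactly one test fold."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def majority_label(labels):
    """Stand-in 'model': always predict the most frequent training label."""
    return max(set(labels), key=labels.count)


def cross_validated_error(labels, k):
    """Average the test error rate of the majority-class predictor over k folds."""
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        guess = majority_label([labels[i] for i in train_idx])
        wrong = sum(1 for i in test_idx if labels[i] != guess)
        fold_errors.append(wrong / len(test_idx))
    return sum(fold_errors) / len(fold_errors)


# Hypothetical labels: 7 zeros and 3 ones.
labels = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
print(cross_validated_error(labels, k=5))
```

The folds here are taken in order for readability; in practice the data would usually be shuffled first so that each fold resembles the whole sample.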

## Data Science – What are Machine Learning (ML) Models?

Machine learning (ML) models are the most commonly used component of a data science project. In this post, you will learn about different definitions of a machine learning model to get a better understanding of what machine learning models are:

- A model is the relationship between features and the label. (Tensorflow – Getting Started for ML Beginners)
- An ML model is a mathematical model that generates predictions by finding patterns in your data. (AWS ML Models)
- ML models generate predictions using the patterns extracted from the input data. (Amazon Machine learning – Key concepts)
- Learning in the supervised model entails creating a function that can be trained by using a training …
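The definitions above share one idea: a model is a function from features to a label whose parameters are learned from training data. A minimal sketch of that idea, using ordinary least squares on a single feature, might look as follows. The function names and the toy training data are assumptions for illustration and do not come from any of the quoted sources.

```python
def fit_line(xs, ys):
    """Learn intercept and slope from training pairs by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def make_model(intercept, slope):
    """Wrap the learned parameters into a function from feature to predicted label."""
    return lambda x: intercept + slope * x


# Hypothetical training set: one feature value and one label per record.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(train_x, train_y)
model = make_model(intercept, slope)
print(intercept, slope, model(5.0))
```

The pattern extracted from the data is the pair of learned parameters; the prediction for an unseen feature value comes from applying the resulting function.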

## 10+ Key Stages of Data Science Project Life cycle

Data science projects need to go through different project lifecycle stages in order to become successful. In each of the stages, different stakeholders get involved, as in a traditional software development lifecycle. In this post, you will learn some of the key stages/milestones of the data science project lifecycle. This article aims to help some of the following project stakeholders, who play key roles in data science project implementation:

- Product managers
- Project managers
- ML architects

The following represents 6 high-level stages of the data science project lifecycle:

1. Planning
2. Model development & testing
3. Product-level changes
4. Model deployment
5. Monitoring the model
6. Model enhancement

Data Science Project Lifecycle – Planning

ML Problem identification: …

## Decision Tree Algorithm – Concepts, Interview Questions

The decision tree is one of the most commonly used machine learning algorithms and can be used for solving both classification and regression problems. It is very simple to understand and use. Here is a lighter take showing how decision trees and related algorithms (random forest, etc.) are agile enough for usage. In this post, you will learn about some of the following in relation to the decision tree algorithm, with reference to C5.0, one of the popular algorithms used to build a decision tree for classification. In another post, we shall also look at the CART methodology for building a decision tree model for classification.

- Key terminologies/definitions
- Key concepts
- Python …
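Decision tree learners in the C4.5/C5.0 family choose splits using entropy-based measures such as information gain. The sketch below computes plain information gain for a categorical feature; it is a simplified illustration under that assumption, not the C5.0 implementation, and the function names and toy data are made up for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical records: one feature separates the classes perfectly, the other not at all.
labels = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]

print(information_gain(informative, labels))
print(information_gain(uninformative, labels))
```

A tree builder would evaluate every candidate feature this way at each node and split on the one with the highest score, then repeat on the resulting subsets.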

## Tutorials – Building Machine Learning Models for Predicting Cancer

In this article, I introduce different aspects of building machine learning models to predict whether a person is suffering from malignant or benign cancer, with emphasis on how machine learning (predictive analysis) can be used to predict a cancer such as Mesothelioma. The approach below can also be applied to other diseases, including different types of cancer.

Predicting Mesothelioma Cancer – Supervised Learning Problem

Machine learning problems are classified into different kinds of learning problems. The most important of them are the following:

- Supervised learning
- Unsupervised learning

Supervised Learning

In supervised learning, you have historical data in which each record is labeled. Thus, in case of predictive analysis of Mesothelioma cancer, there is …

## Top 8 Neural Networks and Deep Learning Tutorials

Here is a list of the top 8 neural network tutorials (web pages) for getting started with neural networks and deep learning.

- Introduction to Deep Neural Networks
- Neural Networks and Deep Learning: Free online book for learning concepts related to neural networks and deep learning. Very good for beginners. Concepts are explained using handwritten digits. The book is authored by Michael Nielsen.
- Neural Networks: The page explains and demonstrates various types of neural networks along with applications of neural networks, such as ANNs in medicine.
- Coursera Course on Neural Networks for Machine Learning: This can be used to learn the fundamentals of artificial neural networks and how they’re being used for machine learning, …

## Neural Networks Interview Questions – Set 1

This page presents a practice test consisting of objective questions on neural networks. The test can also prove useful for interviews, and the questions are aimed at machine learning interns / freshers / beginners. The questions relate to some of the following topics:

- Introduction to neural networks
- Perceptron / Sigmoid neuron
- Types of neural networks
- Cost function for neural networks

Practice Test on Neural Networks

## 70 Regression Analysis Interview Questions & Practice Tests

This page lists practice tests (questions and answers) and links to PDF files (consisting of interview questions) on Linear / Logistic Regression for machine learning / data science enthusiasts. These questions can prove useful, especially for machine learning / data science interns / freshers / beginners, to check their knowledge from time to time or to prepare for upcoming interviews.

Practice Tests on Linear / Multilinear Regression

These are a set of four practice tests (consisting of 40 questions) covering linear (univariate) and multilinear (multivariate) regression in detail.

- Linear, Multiple regression interview questions and answers – Set 1
- Linear, Multiple regression interview questions and answers – Set 2
- Linear, Multiple regression interview …

## Logistic Regression Interview Questions & Practice Tests

This page lists a set of 30 interview questions on Logistic Regression (machine learning/data science) in the form of objective questions, and also provides links to a set of three practice tests that would help you test / check your knowledge on an ongoing basis. These questions and practice tests are primarily intended to help interns/freshers/beginners brush up their knowledge of logistic regression from time to time. The following is a list of topics covered on this page.

- Introduction to logistic regression
- Logistic regression examples
- Evaluating performance of logistic regression and related techniques including AIC, deviance, ROC etc.
- Difference between linear and logistic regression

Here is another post on …