Sai Geetha M N

- Jun 25, 2021
- 6 min

Data Scientists, Data Engineers, ML Engineers And More - Demystified

As the world of Big Data, Machine Learning and Artificial Intelligence is taking off, there is an overlap of roles and responsibilities...

Sai Geetha M N

- Jun 18, 2021
- 13 min

HBase Design - Guidelines & Best Practices

We have looked at HBase Fundamentals and HBase Architecture in the last two weeks. Today I will look at a few best practices and...

Sai Geetha M N

- Jun 10, 2021
- 7 min

HBase Architecture

We looked at the basics of HBase in last week's article. Today we will understand the Architecture of HBase. We all agree...

Sai Geetha M N

- Jun 3, 2021
- 9 min

HBase Fundamentals

HBase is a NoSQL DB that uses some capabilities of the Hadoop ecosystem to provide its features. NoSQL DBs (a.k.a. Not Only SQL) are...

Sai Geetha M N

- May 27, 2021
- 5 min

MultiCollinearity

Multicollinearity is a concept relevant to all the input data used in a Machine Learning Algorithm. This has to be understood...

Sai Geetha M N

- May 20, 2021
- 4 min

Outliers and their treatment

Outliers need specific handling during data analysis and data preparation so that the data fed to a machine learning...

Sai Geetha M N

- May 13, 2021
- 5 min

Feature Scaling and its Importance

Feature Scaling is a very important aspect of data preparation for many Machine Learning Algorithms. Let us understand what is feature...

Sai Geetha M N

- May 6, 2021
- 10 min

Linear Regression Through Code - Part 2

Last week, in Part 1, I walked through all the preliminary steps to be done before you can build a Linear Regression model. This week we...

Sai Geetha M N

- Apr 29, 2021
- 8 min

Linear Regression Through Code - Part 1

#Tutorial In an earlier blog post, I spoke about "What is Regression?" and the basic linear equation. This is one of the...

Sai Geetha M N

- Apr 28, 2021
- 2 min

Types of Variables - Definition

#Definition There are different characteristics of data that are used for analysis and machine learning. One very fundamental...

Sai Geetha M N

- Apr 22, 2021
- 9 min

Data Validation - During Ingestion into Data Lake

Any enterprise that wants to harness the power of data almost always begins by building a data lake. By definition, a data lake is a...

Sai Geetha M N

- Apr 17, 2021
- 1 min

ACID Vs BASE - A definition

#Definitions ACID is a characteristic of RDBMS databases. Atomic: Each task in a transaction succeeds or the entire transaction is rolled...

Sai Geetha M N

- Apr 16, 2021
- 4 min

Making the Right Database Choice

#ArchitecturalDecision If someone were to ask, "Should I use a SQL or a NoSQL database?", the obvious answer is "it depends". Depends on what?...

Sai Geetha M N

- Apr 8, 2021
- 2 min

Regression Algorithms

#ExecutiveSummary #MLModels What is Regression? Regression is a statistical model/method used to determine the strength and character of...

Sai Geetha M N

- Apr 1, 2021
- 6 min

Machine Learning Process - A Success Recipe

#MachineLearningProcess #ExecutiveSummary Introduction: It is said that "The world's most valuable resource is no longer oil, but data"....

Sai Geetha M N

- Mar 19, 2021
- 4 min

Hadoop for Analysts - Apache Druid, Apache Kylin and Interactive Query Tools

#ToolComparison #ArchitectureDecision Introduction: Traditional Data Warehouses have existed in the industry for quite some time now. They...

Sai Geetha M N

- Mar 16, 2021
- 4 min

Machine Learning Algorithms Categories

Machine Learning Algorithms learn from data as humans learn from experience. But the type of learning and the goal varies from algorithm...

Sai Geetha M N

- Mar 10, 2021
- 3 min

The Machine Learning Landscape

If you are looking to start learning about the basics of Machine Learning, you are in the right place. My blog will cover overviews of...

