Posts

Outliers: Selection vs. Detection

A common method for detecting fraud is to look for outliers in data. It’s a fair approach: even if a detected outlier doesn’t immediately imply fraud, it can be a good candidate for further investigation. Still, how might we go about selecting hyper-parameters (or even the algorithm)? The hard part is that we have very little to go on. Just like with clustering, there’s no label. It is incredibly tough to argue whether a certain model is appropriate for a use case. Luckily there’s a small trick that can help. How about we try to find outliers that simply correlate with fraudulent cases? It might be a surprise to find out that scikit-learn has support for this, but it occurs via a slightly unusual pattern.

Setup

I will demonstrate an approach using this dataset from Kaggle. It’s an unbalanced dataset meant for a fraud use case.

import numpy as np
import pandas as pd
import matplotlib.pylab as plt
df = pd.read_csv("creditcard.csv").rename(str.lower, axis=1)
X, y = df.dr...
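
The trick described in this excerpt, choosing an outlier detector’s settings by how well its flags line up with known fraud labels, can be sketched without any libraries. This is not the scikit-learn pattern the post goes on to show; it is a plain-Python stand-in under stated assumptions. The synthetic data, the z-score detector, and the names `make_data`, `flag_outliers`, and `select_threshold` are all made up for illustration.

```python
import random
import statistics


def make_data(n=2000, fraud_rate=0.02, seed=0):
    """Synthetic 1-D data with a rare, shifted 'fraud' class (illustrative only)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        is_fraud = rng.random() < fraud_rate
        X.append(rng.gauss(4.0 if is_fraud else 0.0, 1.0))
        y.append(int(is_fraud))
    return X, y


def flag_outliers(X, threshold):
    """Flag points whose absolute z-score exceeds `threshold` (the hyper-parameter)."""
    mu = statistics.fmean(X)
    sigma = statistics.pstdev(X)
    return [int(abs(x - mu) / sigma > threshold) for x in X]


def precision_recall(y_true, y_pred):
    """Precision and recall of the outlier flags against the fraud labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def select_threshold(X, y, candidates):
    """Return (threshold, f1, precision, recall) for the best-scoring candidate."""
    best = None
    for threshold in candidates:
        p, r = precision_recall(y, flag_outliers(X, threshold))
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        if best is None or f1 > best[1]:
            best = (threshold, f1, p, r)
    return best


X, y = make_data()
candidates = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
threshold, f1, precision, recall = select_threshold(X, y, candidates)
print(f"best threshold={threshold}, f1={f1:.2f}, "
      f"precision={precision:.2f}, recall={recall:.2f}")
```

The labels are never used to fit the detector, only to score each candidate setting. This sketch scores on the same data it selects on; a held-out split or cross-validation would give a more honest estimate.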

Why Python for Web Development

There are so many options for developing web apps these days that it would take tens of thousands of words to list and describe each one of them. Languages like Java, JavaScript, C#, and Python are among the most popular for developing web apps. In this article, I will discuss some of the benefits of using Python, specifically, for the development of web apps.

Easy to learn

Python is one of the easiest languages to learn. If you are an experienced developer, you can learn enough Python in a week to be dangerous and do a lot. If you are a complete newbie, Python is a great first language: it has a clear syntax and lets you get started as quickly as possible. In any case, if you want a hand in starting out with Python, try my free Python Guide For Beginners to get you up to speed as fast as possible. This image from xkcd exemplifies this better than I ever could:

Ecosystem

Libraries for everything. Python has a library for every use case. Fro...

Data Visualization in Python with matplotlib and Seaborn

Data visualization is an important aspect of all AI and machine learning applications. You can gain key insights into your data through different graphical representations. In this tutorial, we’ll talk about a few options for data visualization in Python. We’ll use the MNIST dataset and the TensorFlow library for number crunching and data manipulation. To illustrate various methods for creating different types of graphs, we’ll use Python’s graphing libraries, namely matplotlib and Seaborn.

After completing this tutorial, you will know how to visualize images in matplotlib, how to make scatter plots in matplotlib and Seaborn, and how to make multiline plots in matplotlib and Seaborn. Let’s get started.

Preparation of scatter data

In this post, we will use matplotlib, seaborn, and bokeh. They are all external libraries that need to be installed. To install them using pip, run the following command:

pip install matplotlib seaborn bokeh

For demonstration purposes, we will also use the MNIS...