The first 3 frustrations you will encounter when migrating Spark applications to AWS EMR

Here we share the first 3 frustrations we had when migrating our anomaly detection Spark applications to EMR and how we troubleshot them.

Read More

Who responds to marketing campaigns?

The overall response rate for a marketing campaign by a Portuguese bank was 11.27%. In contrast, the 198 campaigns (out of 40K+) that targeted people who had purchased a product the last time, on dates when the Euribor rate (the Euro Interbank Offered Rate) was less than 0.7, had an 80.30% response rate. In fact, there are hundreds more groups like this one. How could we detect ALL of these good groups with abnormally high response rates?

Read More

Find exceptionally bad performance groups in your loan portfolio

Some popular machine learning algorithms fail to identify exceptionally bad segments in a loan portfolio when those segments are very small relative to the population. This post explains a way to identify every bad-performing group in the data using big data technology, with the popular German Credit Data as an example.

Read More

Building an API for your credit model in 5 minutes

The last 3 posts on creating an API demonstrated that, yes, you can build your own API, but it requires significant development resources, knowledge, and time. Furthermore, once your API goes live, you have to maintain and monitor it, which can take up a significant amount of your time (you don't want to get a call while sleeping saying that your API suddenly stopped). This post suggests a better way to build and maintain an API, using a credit model as an example.

Read More

To do or not to do, a dilemma between Cloud and On-Premise in corporations

Cloud computing and storage have become an integral part of IT infrastructure. They are highly scalable, and you only pay for what you use. So should everyone use the cloud?

Read More

Factors in Machine Learning Model for Fraud Prevention

Machine learning has become a very popular tool for preventing fraud in online businesses, but generating meaningful variables has been tough because fraudsters also learn and evolve. This post discusses how to create a comprehensive list of variables that perform consistently in fighting online fraud.

Read More

How to scale up credit model APIs using AWS

The last post explained how to improve performance for a credit model built in R. Specifically, it showed that using multiple Docker containers could cut the credit model's latency by two-thirds. This post further explores ways to build a scalable and fault-tolerant infrastructure for APIs of machine learning models using Amazon Web Services (AWS).

Read More

How to scale up RESTful APIs using Docker

Our last post talked about creating RESTful APIs for machine learning credit models in R using Plumber. A single instance of a Plumber app will break if many requests arrive at the same time. This post shows how to scale up R RESTful APIs using Docker to solve this scalability issue.

Read More

How to create a RESTful API for a machine learning credit model in R

Even though R probably provides the largest number of machine learning algorithms out there, its packages for application development are few, and thus data scientists often find it difficult to push their deliverables to their organizations' production environments. Our previous post suggested creating a RESTful API as a remedy, and this post explains step by step how to create APIs using R's open source packages.

Read More

Seeking solutions to help programmers using different languages collaborate better

Did you know that more than 200 programming languages are listed on Wikipedia? With this many languages and today's level of specialization, it is likely your organization has coders using different languages. This post outlines some of the symptoms that can arise from this multiple-language situation, explores four solutions, and suggests the best one.

Read More