Assertion error and Theano function

Posted on Fri 03 August 2018 in programming notes • Tagged with data, analysis, python, theano, scikit-learn, machine learning, threads, queue, semaphore

Introduction

Theano is used in one of the projects I'm working on. The project is a web server that accepts requests and processes them using a Theano function. When the server tries to process two simultaneous requests, it fails with the error:

File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/home/dinara/work_projects/ds_voxrec_api/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-3.5.2-64/scan_perform/mod.cpp:4490)
AssertionError: The compute map of output 0 should contain 1 at the end of execution, not 0.
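
Since the error shows up only when two requests are processed at once, one common way to avoid it is to let a single thread at a time call the shared function. The sketch below is an illustration under that assumption, not necessarily the fix described in the full post: `predict` is a hypothetical stand-in for the compiled Theano function, and `handle_request` is a hypothetical request handler.

```python
import threading

# Hypothetical stand-in for a compiled Theano function. In the real
# project this would be the object returned by theano.function(...).
def predict(x):
    return x * 2

# One lock shared by all request-handling threads.
_predict_lock = threading.Lock()

def handle_request(x):
    # Only one thread at a time may run the shared function;
    # other threads wait here until the lock is released.
    with _predict_lock:
        return predict(x)

# Simulate four simultaneous requests.
results = []
threads = [
    threading.Thread(target=lambda v=v: results.append(handle_request(v)))
    for v in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock serializes calls, so throughput is limited to one prediction at a time. A queue with a single worker thread, or a semaphore, would be alternative ways to get the same guarantee.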

Continue reading

Titanic disaster analysis

Posted on Thu 17 November 2016 in data analysis • Tagged with data, analysis, python, pandas, matplotlib, scikit-learn, numpy, machine learning, kaggle

I'm a newbie at Kaggle and new to machine learning. I'll try to make this exploration interesting and detailed.

1. Data analysis

1.1. Expectations

What do I expect from this analysis? I'll create a model that predicts survival on the Titanic. Along the way, I'll illustrate every dependency I find.
First of all, I want to understand what kinds of variables I have.
Continue reading


Future stock prices prediction based on the historical data using simplified linear regression

Posted on Thu 06 October 2016 in data analysis • Tagged with data, analysis, python, pandas, matplotlib, scikit-learn, numpy, machine learning, linear regression

In this post I want to give a simplified explanation of what the linear regression model is and how to apply it to data prediction using Python and some open-source Python libraries (including scikit-learn).

Supervised learning is one of the major categories of machine learning algorithms. "Supervised" means we already have a dataset in which the "correct answers" are given. For example, we have stock data with open and close values for the past few years, and we want to predict future values (prices or indexes). Supervised learning is subdivided into regression problems and classification problems. In a regression problem we try to predict a continuous output value (such as a stock price).
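
To make the regression idea concrete, here is a minimal sketch of fitting a straight line by ordinary least squares in plain Python. The prices are made-up numbers for illustration only, and `fit_line` is a hypothetical helper, not code from this post; scikit-learn's `LinearRegression` fits the same kind of model.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up example data: day index and closing price (the "correct answers").
days = [1, 2, 3, 4, 5]
close_prices = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(days, close_prices)

# Predict a continuous value for a day that is not in the dataset.
next_close = slope * 6 + intercept
```

Real stock data is far noisier than this toy series, so a fitted line should be read as a rough trend rather than a reliable forecast.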

Continue reading

Small drug use data analysis

Posted on Tue 06 September 2016 in data analysis • Tagged with matplotlib, data, analysis, python

I was looking for health databases for my first research project. While searching I found the National Survey on Drug Use and Health. The data were collected and prepared for release by Research Triangle Institute, Research Triangle Park, North Carolina. Since these data are available to the general public, I chose this database. I downloaded the data file and the codebook needed to decode it. Reading the codebook (at least the introduction) is important for understanding the data better.

Continue reading