I started a course on Machine Learning by Stanford University on Coursera in February. As of now, I have completed more than half of the course (up to Week 6), and SVMs are next. I wanted to document my progress so far and briefly mention the things I have learnt in ML.
First of all, I'd like to state that I had signed up for this course twice before and ended up leaving it midway, due to my semester exams and a lack of determination. I had never made it past Week 3. So I guess the fact that I have reached Week 6 this time shows that I have strengthened my willpower in this regard: there were mid-semester exams in between this time too, but I managed to finish my programming exercises well before the deadlines.
<rant> Implementing algorithms in Octave really irks me! So many times I have understood a concept crystal clear, but when I sit down to implement it in Octave, the experience is so messy that it saps all the fun out of it. I have actually spent hours racking my brains to figure out the right code, when the issue was only a damn parenthesis or some other silly mistake. This has led me to doubt my interest in the course a few times. I don't want to let that happen any further. </rant>
So this time, I focused less on writing the code in Octave and more on the concepts. I'm sure that when I get down to implementing these algorithms with some library, it won't be this messy. And it had to be done before Octave sapped all my confidence and excitement towards ML. I do feel a bit sorry about my slightly unorthodox approach to the course, but somewhere I also feel good to be moving on with it. If you quiz me on ML concepts, and even on tricks for implementing the algorithms, I'm sure I'll fare pretty well; but if you ask me to implement them again in Octave, then I'm sorry!
Some of the concepts I have learnt through this course include:
- Linear Regression – A regression-type algorithm, i.e. its hypothesis predicts a continuous value rather than a class label.
- Gradient Descent – An iterative method we use to find the minimum of the cost function, and thereby the optimum parameters for the algorithm.
- Normal Equation – The normal equation gives us a way to solve for the parameters analytically, without gradient descent: there is no need to iterate or to choose the learning rate alpha. However, it becomes slow when the number of features is extremely large (it requires inverting an n×n matrix), in which case gradient descent is preferred.
- Logistic Regression – Used as a classification algorithm. The hypothesis is a sigmoid function, so it outputs a value between 0 and 1 that can be read as the probability of the positive class; we predict 0 or 1 by thresholding it (typically at 0.5). To find the optimum parameters theta, we run gradient descent on our data set, just like in linear regression. Multi-class classification is handled too, via the one-vs-all approach.
- Regularization – Regularization is used to solve the problem of over-fitting. If the algorithm fits the training data too well, it may give high error on new examples. So we add a regularization term to the cost function, which penalizes large parameter values and shrinks their magnitude.
- Neural Networks – The lectures on neural networks taught me about a new way of building algorithms that learn their own features by propagating through multiple layers. We forward-propagate the input through the layers we have created to compute the hypothesis, then backpropagate over the training set to compute the gradients, and employ (a modified form of) gradient descent to find the optimum values of the parameters theta.
I have to admit, I initially found neural networks a bit daunting, and I think I need to revisit the programming assignment to understand things better. I'm sure another pass through the lecture videos will sort this out easily.
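To make the forward-propagation step concrete, here is a minimal NumPy sketch of the classic XNOR network from the lectures. The weights are hand-picked for illustration, not learnt; in the actual assignments, backpropagation and gradient descent would find such weights automatically. This is my own illustration, not the course's Octave code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate(x, Theta1, Theta2):
    """Forward propagation through a network with one hidden layer.
    A bias unit (value 1) is prepended at each layer, as in the lectures."""
    a1 = np.concatenate(([1.0], x))                     # input activations + bias
    a2 = np.concatenate(([1.0], sigmoid(Theta1 @ a1)))  # hidden activations + bias
    a3 = sigmoid(Theta2 @ a2)                           # output activation (hypothesis)
    return a3

# Hand-picked weights from the lecture example: the two hidden units compute
# (x1 AND x2) and (NOT x1 AND NOT x2), and the output unit ORs them,
# so the whole network computes XNOR.
Theta1 = np.array([[-30.0,  20.0,  20.0],
                   [ 10.0, -20.0, -20.0]])
Theta2 = np.array([[-10.0,  20.0,  20.0]])

for x1 in (0.0, 1.0):
    for x2 in (0.0, 1.0):
        h = forward_propagate(np.array([x1, x2]), Theta1, Theta2)[0]
        print(f"XNOR({int(x1)}, {int(x2)}) ~ {h:.3f}")
```

The point of the exercise is that stacking simple sigmoid units lets the network represent a function (XNOR) that a single logistic unit cannot.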
- General but important advice on applying machine learning algorithms and choosing the one best suited to your problem. I also learnt about the various diagnostics to run on an algorithm, metrics to estimate the suitability of a particular approach, and ways to debug the algorithm based on the diagnosis.
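The gradient descent update from the list above can be sketched in a few lines of NumPy. This is a hedged illustration under my own assumptions (not the course's Octave code), and the toy data is made up to follow y = 1 + 2x:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, num_iters=5000):
    """Batch gradient descent for linear regression.
    X is the m x (n+1) design matrix (first column all ones, for the
    intercept term); y is the m-vector of targets."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        gradient = (X.T @ (X @ theta - y)) / m  # gradient of the squared-error cost
        theta -= alpha * gradient               # simultaneous update of all parameters
    return theta

# Toy data generated from y = 1 + 2x (made-up example values)
x = np.array([1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

theta = gradient_descent(X, y)
print(theta)  # should be close to [1, 2]
```

The same data could also be solved in one step with the normal equation, theta = (XᵀX)⁻¹Xᵀy, which is exactly the trade-off the course discusses: no iterations or learning rate, but a matrix inversion that gets expensive as the number of features grows.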
This is a concise summary of what I have learnt to date. I will post another such summary for the remaining part of the course as I progress through the upcoming programming exercises.
When I'm done with the course, I plan to write one blog post for each week of the course. It will be a great form of revision for me, and will also serve as a good reference for other people wanting to learn about ML.
I am not sure whether I should upload all my ML work to a repository on GitHub, because I know people could use the code for their own programming assignments, and that wouldn't serve the purpose of their learning. But isn't that unethical on THEIR part? Why should I worry about it?