
D.I.Y. Logistic Regression Model

Nishpish • Created July 4, 2020
670 views

Instructions

Table of contents:
1) Description
2) Instructions
3) How it works (optional, in "Notes and Credits")

Description:
With this program, you can make your own customized logistic regression model! Essentially, you input data and an algorithm returns the class that it thinks best represents the inputs. For example, the logistic model could have you enter a person's height and weight, then use an algorithm to determine which body type it "thinks" you are describing, perhaps determining whether the person is overweight. To be able to do this, it first needs to be trained. Using this example, you could train it by entering the heights and weights of overweight and non-overweight individuals.

Instructions:
Turn turbo mode on. (There are instructions for doing this in the project.) Once you do so, answer the questions to customize the logistic model. In the overweight example, you would choose two input types (height and weight), which you could name 'height' and 'weight'. Next, it will ask how many outputs you would like the model to have. In this example, there are two possible outputs, 'overweight' and 'not overweight'. After you finish setting up these parameters, you will see a graphical representation of the model you just made. You can mouse over the dots to see what each one represents.

Since your logistic model does not "know" anything yet, you need to feed it data. To do this, press 'Feed'. It will have you enter values for all your inputs (height-weight pairs in our example). Once you are done, choose the correct output for the values you entered ('overweight' or 'not overweight'). You will then return to the menu.

Now either feed it more data or press 'Train' to train the model on the data you fed it. Training minimizes the model's error: the lower the error, the more accurately it will assign your inputs (height-weight pairs) to the correct output category (overweight or not overweight). If properly trained, it should accurately categorize inputs that differ from the training sample. An error lower than 2 is good, but the model won't be perfect unless the error reaches 0. You can exit training whenever you want by clicking anywhere on the screen; click again to return to the home menu. From here, you can either feed the model more data or test it. Just remember to train again if you feed it more.

If you click 'Test', you will be asked for input values, and the model will predict an output based on them. It will list the most probable output at the top and the other possibilities below in decreasing order. Don't be surprised if your model gives you the wrong output; this usually means you have not yet fed it enough data or the error is not yet low enough. If this happens, just feed it more data or train it again. Have fun!

(How it works is in "Notes and Credits".)
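To make the 'Test' step concrete, here is a minimal Python sketch (not the project's Scratch code) of how a model can score every output with a sigmoid and list the possibilities in decreasing order; the label names and the weight values below are made-up placeholders for the height/weight example, not values from the project:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rank_outputs(x, W, b, labels):
    """List every output class, most probable first, as the 'Test' menu does."""
    scores = sigmoid(x @ W + b)        # one score in (0, 1) per output class
    order = np.argsort(scores)[::-1]   # indices sorted by decreasing score
    return [(labels[i], float(scores[i])) for i in order]

# Hypothetical trained parameters for the two-input, two-output
# height/weight example; a real model learns these from the data it is fed.
labels = ["overweight", "not overweight"]
W = np.array([[0.03, -0.03],
              [0.05, -0.05]])
b = np.array([-5.0, 5.0])
print(rank_outputs(np.array([170.0, 95.0]), W, b, labels))
```

With these placeholder weights, the 170 cm / 95 kg input scores 'overweight' highest, which is the kind of ranked answer the project prints after testing.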

Notes and Credits

If you still don't get what the program does after reading the description, watch this video: https://www.youtube.com/watch?v=yIYKR4sgzI8

How it works:
This is an optional section for those with advanced math knowledge. You may be wondering how the model calculates outputs from its inputs. It uses the formula σ(x·W + b) = o, where x is the input and o is the output, both represented as vectors. W is a matrix of weights, each of which is a parameter that can be tweaked to optimize the model. b is a vector that serves a similar purpose to W, though it enters the equation in a different place. Finally, σ is the sigmoid function, σ(x) = 1/(1 + e^-x), which maps any number to a value between 0 and 1. This is the entire process by which the logistic regression model calculates its outputs.

But how does it optimize? It does so by minimizing the error, which is computed with this function:

c = (1/2) Σᵢ (yᵢ - σ(xᵢ·W + b))²

Here yᵢ is the vector of correct outputs for training example i, and the sum runs over all training examples. We want this error to be as low as possible, since if the error is 0, the predicted outputs exactly equal the correct outputs. To minimize it, we use derivatives. As a reminder, a derivative tells us the slope of a function at a given point. (In this case, since the function has multiple variables, we use partial derivatives.) If we change W and b, the adjustable parameters, by the negative partial derivative of the error with respect to W and b respectively, we lower the error. These derivatives are:

∂c/∂W = Σᵢ -xᵢ(yᵢ - σ(xᵢ·W + b))(σ(xᵢ·W + b))(1 - σ(xᵢ·W + b))
∂c/∂b = Σᵢ -(yᵢ - σ(xᵢ·W + b))(σ(xᵢ·W + b))(1 - σ(xᵢ·W + b))

A problem arises if we use the plain partial derivatives: we will overshoot and never reach the minimum. So we must scale them down by a certain amount, called the learning rate, η, whose value is usually on the order of 10^-n for some small n. Our final update formulas are then:

W₂ = W₁ - η Σᵢ -xᵢ(yᵢ - σ(xᵢ·W₁ + b₁))(σ(xᵢ·W₁ + b₁))(1 - σ(xᵢ·W₁ + b₁))
b₂ = b₁ - η Σᵢ -(yᵢ - σ(xᵢ·W₁ + b₁))(σ(xᵢ·W₁ + b₁))(1 - σ(xᵢ·W₁ + b₁))

One thing to be aware of is that we need to apply these formulas iteratively to noticeably reduce the error, because a single iteration only makes a small change. And that's it!

(Note: in the project, I actually use the difference quotient formula to calculate the derivatives. Strangely, both methods seem to run at roughly the same speed, and the difference quotient uses much less code. Aside from that, I made a few other minor changes to help it run better in Scratch.)
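For readers who want to see the math above as code, here is a minimal NumPy sketch of the whole procedure, using the same sigmoid, squared-error cost, and gradient-descent updates as above. The function names, learning rate, and step count are illustrative choices, not taken from the project; the difference-quotient version at the end mirrors the approach the note above says the project actually uses, but it is a sketch, not the project's code:

```python
import numpy as np

def sigmoid(z):
    # sigma(x) = 1 / (1 + e^-x), squashes any number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W, b):
    # o = sigma(x . W + b), computed for every training example at once
    return sigmoid(X @ W + b)

def cost(X, Y, W, b):
    # c = (1/2) * sum_i (y_i - sigma(x_i . W + b))^2
    return 0.5 * np.sum((Y - forward(X, W, b)) ** 2)

def train(X, Y, W, b, eta=0.1, steps=2000):
    """Gradient descent using the analytic partial derivatives."""
    for _ in range(steps):
        o = forward(X, W, b)
        # shared factor -(y_i - o)(o)(1 - o) of both derivatives
        delta = -(Y - o) * o * (1.0 - o)
        dW = X.T @ delta          # dc/dW = sum_i -x_i (y_i - o)(o)(1 - o)
        db = delta.sum(axis=0)    # dc/db = sum_i -(y_i - o)(o)(1 - o)
        W, b = W - eta * dW, b - eta * db   # W2 = W1 - eta*dc/dW, etc.
    return W, b

def numeric_grad_W(X, Y, W, b, h=1e-5):
    """Difference-quotient alternative to the analytic dc/dW:
    dc/dW[j,k] ~ (c(W + h) - c(W)) / h, one weight at a time."""
    g = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        Wh = W.copy()
        Wh[idx] += h
        g[idx] = (cost(X, Y, Wh, b) - cost(X, Y, W, b)) / h
    return g
```

Running train repeatedly on the fed data drives the cost toward 0, which matches the project's advice that a lower error means inputs are assigned to the correct output category more accurately.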

Project Details

Project ID: 409588348
Created: July 4, 2020
Last Modified: June 7, 2021
Shared: July 16, 2020
Visibility: Visible
Comments: Allowed