Tech It Yourself

## Redis

July 12, 2021

1. Introduction

Redis is an open source, advanced key-value store and an apt solution for building high performance, scalable web applications.

Redis has main features:

• Redis holds its database entirely in the memory, using the disk only for persistence. It supports very high write and read speed but the data sets can't be larger than the memory. Moreover It is easier to represent  the complex data structures on memory than on disk.
• All the commands in a transaction are serialized and executed sequentially. It can never happen that a request issued by another client is served in the middle of the execution of a Redis transaction. ... Either all of the commands or none are processed, so a Redis transaction is also atomic
• Redis has a relatively rich set of data types such as list, set, sorted set, and hashes when compared to many key-value data stores.
• Redis can replicate data to any number of slaves.
2. Installation & Configuration
2.1 Installation
$sudo apt-get update$sudo apt-get install redis-server
Redis desktop manager on Ubuntu:

2.2. Configuration
Get configuration
redis 127.0.0.1:6379> config get config_name
redis 127.0.0.1:6379> CONFIG GET *
Set configuration
redis 127.0.0.1:6379> config set config_name new_config_value
CONFIG SET loglevel "notice"
redis 127.0.0.1:6379> CONFIG GET loglevel
2.3 Commands
$redis-cli  redis 127.0.0.1:6379> ping PONG Run Commands on the Remote Server $ redis-cli -h host -p port -a password
Get all statistics and information about the server
redis 127.0.0.1:6379> INFO

# Server
redis_version:2.8.13
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:c2238b38b1edb0e2
redis_mode:standalone
os:Linux 3.5.0-48-generic x86_64
arch_bits:64 
2.4. Data Types
2.4.1 Keys
Operate with keys in Redis.
redis 127.0.0.1:6379> COMMAND KEY_NAME
There are some commands related to keys such as:
EXPIREAT command is used to set the expiry of key in Unix timestamp format. After the expiry time, the key will not be available in Redis.
redis 127.0.0.1:6379> Expireat KEY_NAME TIME_IN_UNIX_TIMESTAMP
redis 127.0.0.1:6379> EXPIREAT learning 1293840000
(integer) 1
EXISTS learning (integer) 0
2.4.2 Strings
Redis string is a sequence of bytes with a known length not determined by any special terminating characters. Its length is up to 512 megabytes.
SET name "learning"
SET and GET are Redis commands, name is the key and learning is the string value
There are some commands related to String such as:
GETRANGE command returns substring of the string value stored at the key, determined by the offsets start and end.
redis 127.0.0.1:6379> GETRANGE KEY_NAME start end
redis 127.0.0.1:6379> SET mykey "This is my test key"
OK
redis 127.0.0.1:6379> GETRANGE mykey 0 3
"This"
redis 127.0.0.1:6379> GETRANGE mykey 0 -1
"This is my test key"
2.4.3 Hashes
Hashes are maps between the string fields and the string values.
Hashes can store up to more than 4 billion field-value pairs.
redis 127.0.0.1:6379> HMSET tutorialspoint name "redis tutorial"
description "redis basic commands for caching" likes 20 visitors 23000 
There are some commands related to Hash such as:
HGET command is used to get the value associated with the field.
redis 127.0.0.1:6379> HSET myhash field1 "foo"
(integer) 1
redis 127.0.0.1:6379> HGET myhash field1
"foo"
redis 127.0.0.1:6379> HEXISTS myhash field2
(nil)
2.4.4 Lists
Lists are lists of strings, sorted by insertion order. You can add elements in Redis lists in the head or the tail of the list.
Maximum length of a list is > 4 billion of elements.
There are some commands related to List such as:
LPUSH command inserts all the specified values at the head of the list stored at the key. If the key does not exist, it is created as an empty list before performing the push operation.
redis 127.0.0.1:6379> LPUSH list1 "foo"
(integer) 1
redis 127.0.0.1:6379> LPUSH list1 "bar"
(integer) 2
redis 127.0.0.1:6379> LRANGE list1 0 -1
1) "foo"
2) "bar"
2.4.5 Sets
Sets are an unordered collection of unique strings. Unique means sets does not allow repetition of data in a key.
redis 127.0.0.1:6379> SADD tutorials redis
(integer) 1
(integer) 1
(integer) 1
(integer) 0
redis 127.0.0.1:6379> SMEMBERS tutorials
1) "mysql"
2) "mongodb"
3) "redis"
There are some commands related to Set such as:
SCARD command is used to return the number of elements stored in a set.
redis 127.0.0.1:6379> SADD myset "hello"
(integer) 1
(integer) 1
(integer) 0
redis 127.0.0.1:6379> SCARD myset
(integer) 2
2.4.6 Sorted Sets
It is similar to Sets. The difference is, every member of a Sorted Set is associated with a score, that is used in order to take the sorted set ordered, from the smallest to the greatest score.
redis 127.0.0.1:6379> ZADD tutorials 1 redis
(integer) 1
redis 127.0.0.1:6379> ZADD tutorials 2 mongodb
(integer) 1
redis 127.0.0.1:6379> ZADD tutorials 3 mysql
(integer) 1
redis 127.0.0.1:6379> ZADD tutorials 3 mysql
(integer) 0
redis 127.0.0.1:6379> ZADD tutorials 4 mysql
(integer) 0
redis 127.0.0.1:6379> ZRANGE tutorials 0 10 WITHSCORES
1) "redis"
2) "1"
3) "mongodb"
4) "2"
5) "mysql"
6) "4"
There are some function related to Sorted Sets such as:
ZINCRBY command increments the score of member in the sorted set.
redis 127.0.0.1:6379> ZINCRBY KEY INCREMENT MEMBER
redis 127.0.0.1:6379> ZADD myset 1 "hello"
(integer) 1
redis 127.0.0.1:6379> ZADD myset 1 "foo"
(integer) 1
redis 127.0.0.1:6379> ZINCRBY myzset 2 "hello"
(integer) 3
redis 127.0.0.1:6379> ZRANGE myzset 0 -1 WITHSCORES
1) "foo"
2) "2"
3) "hello"
4) "3"
2.5 HyperLogLog
HyperLogLog is an algorithm that uses randomization in order to provide an approximation of the number of unique elements in a set using just a constant, and small amount of memory (12 kbytes per key with a standard error of 0.81%).
redis 127.0.0.1:6379> PFADD tutorials "redis"
1) (integer) 1
1) (integer) 1
1) (integer) 1
redis 127.0.0.1:6379> PFCOUNT tutorials
(integer) 3 
2.6 Publish - Subscribe
Redis Pub/Sub is a messaging system where the senders (publishers) sends the messages to the receivers (subscribers). The link by which the messages are transferred is called channel.
A client can subscribe any number of channels.
redis 127.0.0.1:6379> SUBSCRIBE redisChat
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "redisChat"
3) (integer) 1 
redis 127.0.0.1:6379> PUBLISH redisChat "Redis is a great caching technique"
(integer) 1  
2.7 Transactions
Transactions allow the execution of a group of commands in a single step.
All commands in a transaction are sequentially executed as a single isolated operation. It is not possible that a request issued by another client is served in the middle of the execution of a Redis transaction.
Redis transaction is also atomic.
Redis transaction is initiated by command MULTI and then you need to pass a list of commands that should be executed in the transaction, after which the entire transaction is executed by EXEC command.
edis 127.0.0.1:6379> MULTI
OK
redis 127.0.0.1:6379> SET tutorial redis
QUEUED
redis 127.0.0.1:6379> GET tutorial
QUEUED
redis 127.0.0.1:6379> INCR visitors
QUEUED
redis 127.0.0.1:6379> EXEC
1) OK
2) "redis"
3) (integer) 1

2.8 Scripting
Scripting uses Lua interpreter. Using EVAL command.
EVAL script numkeys key [key ...] arg [arg ...]
redis 127.0.0.1:6379> EVAL "return {KEYS[1],KEYS[2],ARGV[1],ARGV[2]}" 2 key1
key2 first second
1) "key1"
2) "key2"
3) "first"
4) "second"
2.9 Pipelining
The client can send multiple requests to the server without waiting for the replies, and then reads the replies in a single step. This helps improving protocol performance.
This is executed from Terminal.
\$ (printf "AUTH "abc"\r\nPING\r\nPING\r\nPING\r\n"; sleep 1) | nc localhost 6379
3. Management
3.1 Backup and Restore
Using SAVE command to backup of the current Redis database.
It creates a dump.rdb file in Redis directory.
Restore Redis Data by moving backup file (dump.rdb) into Redis directory and start the server.
Get Redis directory by CONFIG command.
127.0.0.1:6379> CONFIG get dir
BGSAVE command will start the backup process and run this in the background.
127.0.0.1:6379> BGSAVE
Background saving started
3.2 Security
Set the password  so that remote connections need to authenticate before executing a command.
127.0.0.1:6379> CONFIG set requirepass "abc"
OK
127.0.0.1:6379> CONFIG get requirepass
1) "requirepass"
2) "abc" 
127.0.0.1:6379> CONFIG REWRITE
OK
Remember to save config file.
After setting the password, client can connect to server but running the command without authentication, will get NOAUTH Authentication required. Hence, the client needs to use AUTH command to authenticate.
127.0.0.1:6379> AUTH "abc"
OK 
3.3 Client Connection
Set the maximum number of clients to 1000
redis-server --maxclients 1000
List of clients connected:
127.0.0.1:6379> CLIENT LIST
Closes a client connection
127.0.0.1:6379> CLIENT KILL 127.0.0.1:52668
3.4 Partitioning
Partitioning splits data into multiple Redis instances, so that every instance will only contain a subset of keys. This helps to use the strength of distributed system. But It is not easy to manage.
Types of Partitioning:
• Range Partitioning: the keys from ID 0 to ID 10000 will go into instance R0, while the keys from ID 10001 to ID 20000 will go into instance R1, ...
• Hash Partitioning: a hash function is used to map the key into a number and then the data is distributed to different Redis instances.

## UML Class Diagram Relationships, Aggregation, Composition

June 24, 2021

Source: here

### UML Class Diagram Relationships, Aggregation, Composition

There are five key relationships between classes in a UML class diagram : dependency, aggregation, composition, inheritance and realization. These five relationships are depicted in the following diagram:

 UML Class Relationships
The above relationships are read as follows:
• Dependency : class A uses class B
• Aggregation : class A has a class B
• Composition : class A owns a class B
• Inheritance : class B is a Class A  (or class A is extended by class B)
• Realization : class B realizes Class A (or class A is realized by class B)
What I hope to show here is how these relationships would manifest themselves in Java so we can better understand what these relationships mean and how/when to use each one.

Dependency is represented when a reference to one class is passed in as a method parameter to another class. For example, an instance of class B is passed in to a method of class A:
public class A {

public void doSomething(B b) {


Now, if class A stored the reference to class B for later use we would have a different relationship called Aggregation. A more common and more obvious example of Aggregation would be via setter injection:
public class A {

private B _b;

public void setB(B b) { _b = b; }


Aggregation is the weaker form of object containment (one object contains other objects). The stronger form is called Composition. In Composition the containing object is responsible for the creation and life cycle of the contained object (either directly or indirectly). Following are a few examples of Composition. First, via member initialization:
public class A {

private B _b = new B();


Second, via constructor initialization:

public class A {

private B _b;

public A() {
_b = new B();
} // default constructor


Third, via lazy init (example revised 02 Mar 2014 to completely hide reference to B):

public class A {

private B _b;

public void doSomethingUniqueToB() {
if (null == _b) {
_b = new B();
}
return _b.doSomething();
} // doSomethingUniqueToB()


Inheritance is a fairly straightforward relationship to depict in Java:

public class A {

...

} // class A

public class B extends A {

....

} // class B


Realization is also straighforward in Java and deals with implementing an interface:

public interface A {

...

} // interface A

public class B implements A {

...

} // class B


## SNPE Quantization Algorithm

June 23, 2021

Source: developer.qualcomm

# Overview

• Non-quantized DLC files use 32 bit floating point representations of network parameters.
• Quantized DLC files use fixed point representations of network parameters, generally 8 bit weights and 8 or 32bit biases. The fixed point representation is the same used in Tensorflow quantized models.

# Quantization Algorithm

• Quantization converts floating point data to Tensorflow-style 8-bit fixed point format
• The following requirements are satisfied:
• Full range of input values is covered.
• Minimum range of 0.01 is enforced.
• Floating point zero is exactly representable.
• Quantization algorithm inputs:
• Set of floating point values to be quantized.
• Quantization algorithm outputs:
• Set of 8-bit fixed point values.
• Encoding parameters:
• encoding-min - minimum floating point value representable (by fixed point value 0)
• encoding-max - maximum floating point value representable (by fixed point value 255)
• Algorithm
1. Compute the true range (min, max) of input data.
2. Compute the encoding-min and encoding-max.
3. Quantize the input floating point values.
4. Output:
• fixed point values
• encoding-min and encoding-max parameters

## Details

1. Compute the true range of the input floating point data.
• finds the smallest and largest values in the input data
• represents the true range of the input data
2. Compute the encoding-min and encoding-max.
• These parameters are used in the quantization step.
• These parameters define the range and floating point values that will be representable by the fixed point format.
• encoding-min: specifies the smallest floating point value that will be represented by the fixed point value of 0
• encoding-max: specifies the largest floating point value that will be represented by the fixed point value of 255
• floating point values at every step size, where step size = (encoding-max - encoding-min) / 255, will be representable
1. encoding-min and encoding-max are first set to the true min and true max computed in the previous step
2. First requirement: encoding range must be at least a minimum of 0.01
• encoding-max is adjusted to max(true max, true min + 0.01)
3. Second requirement: floating point value of 0 must be exactly representable
• encoding-min or encoding-max may be further adjusted
3. Handling 0.
1. Case 1: Inputs are strictly positive
• the encoding-min is set to 0.0
• zero floating point value is exactly representable by smallest fixed point value 0
• e.g. input range = [5.0, 10.0]
• encoding-min = 0.0, encoding-max = 10.0
2. Case 2: Inputs are strictly negative
• encoding-max is set to 0.0
• zero floating point value is exactly representable by the largest fixed point value 255
• e.g. input range = [-20.0, -6.0]
• encoding-min = -20.0, encoding-max = 0.0
3. Case 3: Inputs are both negative and positive
• encoding-min and encoding-max are slightly shifted to make the floating point zero exactly representable
• e.g. input range = [-5.1, 5.1]
• encoding-min and encoding-max are first set to -5.1 and 5.1, respectively
• encoding range is 10.2 and the step size is 10.2/255 = 0.04
• zero value is currently not representable. The closest values representable are -0.02 and +0.02 by fixed point values 127 and 128, respectively
• encoding-min and encoding-max are shifted by -0.02. The new encoding-min is -5.12 and the new encoding-max is 5.08
• floating point zero is now exactly representable by the fixed point value of 128
4. Quantize the input floating point values.
• encoding-min and encoding-max parameter determined in the previous step are used to quantize all the input floating values to their fixed point representation
• Quantization formula is:
• quantized value = round(255 * (floating point value - encoding.min) / (encoding.max - encoding.min))
• quantized value is also clamped to be within 0 and 255
5. Outputs
• the fixed point values
• encoding-min and encoding-max parameters

## Quantization Example

• Inputs:
• input values = [-1.8, -1.0, 0, 0.5]
• encoding-min is set to -1.8 and encoding-max to 0.5
• encoding range is 2.3, which is larger than the required 0.01
• encoding-min is adjusted to −1.803922 and encoding-max to 0.496078 to make zero exactly representable
• step size (delta or scale) is 0.009020
• Outputs:
• quantized values are [0, 89, 200, 255]

## Dequantization Example

• Inputs:
• quantized values = [0, 89, 200, 255]
• encoding-min = −1.803922, encoding-max = 0.496078
• step size is 0.009020
• Outputs:
• dequantized values = [−1.8039, −1.0011, 0.0000, 0.4961]

## UML Diagrams

May 10, 2021

1. Main diagrams:

Behavioral Diagram

•     Activity Diagram
•     Use Case Diagram
•     Timing Diagram
•     State Machine Diagram
•     Communication Diagram
•     Sequence Diagram

Structural Diagram

•     Class Diagram
•     Object Diagram
•     Component Diagram
•     Composite Structure Diagram
•     Deployment Diagram
•     Package Diagram
•     Profile Diagram

2. Some important diagrams:

Activity Diagram

It is generally used to describe the flow of different activities and actions. These can be both sequential and in parallel. It is used to:

• Model workflows between/within use cases
• Model complex workflows in operations on objects
• Model in detail complex activities in a high level activity Diagram
Activity Diagram - Modeling a Word Processor

• Open the word processing package.
• Create a file.
• Save the file under a unique name within its directory.
• Type the document.
• If graphics are necessary, open the graphics package, create the graphics, and paste the graphics into the document.
• Save the file.
• Print a hard copy of the document.
• Exit the word processing package.
Activity Diagram Example - Student Enrollment
• An applicant wants to enroll in the university.
• The applicant hands a filled out copy of Enrollment Form.
• The registrar inspects the forms.
• The registrar determines that the forms have been filled out properly.
• The registrar informs student to attend in university overview presentation.
• The registrar helps the student to enroll in seminars
• The registrar asks the student to pay for the initial tuition.

Use Case Diagram

A cornerstone part of the system is the functional requirements that the system fulfills. Use Case diagrams are used to analyze the system’s high-level requirements. These requirements are expressed through different use cases. We notice three main components of this UML diagram:

Functional requirements – represented as use cases; a verb describing an action

Actors – they interact with the system; an actor can be a human being, an organization or an internal or external application

Relationships between actors and use cases – represented using straight arrows

Dependencies

A number of dependency types between use cases are defined in UML. In particular, <<extend>> and <<include>>.

• <<extend>> is used to include optional behavior from an extending use case in an extended use case.
• <<include>> is used to include common behavior from an included use case into a base use case in order to support re-use of common behavior.

The example below depicts the use case UML diagram for an inventory management system. In this case, we have the owner, the supplier, the manager, the inventory clerk and the inventory inspector.

Within the circular containers, we express the actions that the actors perform. Such actions are: purchasing and paying for the stock, checking stock quality, returning the stock or distributing it. As you might have noticed, use case UML diagrams are good for showing dynamic behaviors between actors within a system, by simplifying the view of the system and not reflecting the details of implementation.

Sequence Diagram

Sequence diagrams describe the sequence of messages and interactions that happen between actors and objects. Actors or objects can be active only when needed or when another object wants to communicate with them. All communication is represented in a chronological manner.
It is used in software development to represent the architecture of the system and how the different components are interconnected

Timing diagram

We are not interested in how the objects interact or change each other, but rather we want to represent how objects and actors act along a linear time axis.

The main components of a timing diagram are:

• Lifeline – a line forming steps since the individual participant transits from one stage to another.
• State timeline – a single lifeline can go through different states within a pipeline
• Duration constraint – a time interval constraint that represents the duration of necessary for a constraint to be fulfilled
• Time constraint – a time interval constraint during which something needs to be fulfilled by the participant
• Destruction occurrence – a message occurrence that destroys the individual participant and depicts the end of that participant’s lifeline

The stages of human growth:

State Machine (Statechart) Diagram

Describe the different states of a component and how it changes based on internal and external events within a system.
A chess game state machine:

Statecharts find usage mainly in forward and reverse engineering of different systems.

Class Diagram

Class UML diagram is the most common diagram type for software documentation. Since most software being created nowadays is still based on the Object-Oriented Programming paradigm, using class diagrams to document the software turns out to be a common-sense solution. This happens because OOP is based on classes and the relations between them.

Class diagrams contain classes, alongside with their attributes (also referred to as data fields) and their behaviors (also referred to as member functions).

The ‘Checkings Account' class and the ‘Savings Account' class both inherit from the more general class, ‘Account'.

Object Diagram

Object diagrams help software developers check whether the generic abstract structure that they have created (class diagram), represents a viable structure when put into practice, i.e: when the objects of a class are instantiated. Some developers see it as a secondary level of accuracy checking.

The class 'Client' has an instance "James".
The class 'Checkings' and 'Savings' has instances Checkings and Savings account.
- 'account_number' and 'routing_number' are different for the Checkings and Savings account. It makes more sense to put those attributes in their respective classes, rather than in the more generic class 'Account'.
- 'wire_routing_number' and 'bic' are not used.

Component Diagram

Component Diagram describes the organization and wiring of the components in a system.

Component diagrams help model implementation details and double-check that every aspect of the system's required functions is developed.

The components are less physical and more conceptual stand-alone design elements.

The components provide or require interfaces to interact with other components in the system.

A component is a logical unit block of the system, a slightly higher abstraction than classes.

The Difference Between a Package Diagram and a Component Diagram:

- Package diagram elements are always public, while component diagram elements are private.

- Components are groups of classes that are deployed together and packages are a general grouping device for model elements. Packages can group any model elements, even things like use cases, but in practice they usually group classes, so components and packages tend to be synonymous.

Parts of a component diagram:

Component

Interface

A full circle represents an interface created or provided by the component. A semi-circle represents a required interface (input).

Dependencies
Dependencies among components
Port
A port is to help expose required and provided interfaces of a component.
Component Diagram - Online Shopping

Component Diagram for ATM

Package Diagram

It is used to show the organization and relationship of various model elements in the form of packages.

A package is a grouping of related UML elements, such as diagrams, documents, classes, or even other packages.

Parts of Package Diagram:

There are two sub-types involved in dependency. They are <<import>> & <<access>>.
Importing a package is equivalent to importing it's all public elements. So, the visibility of import can be thought of the visibility of the elements (imported elements).
import is public
access is private

Package Diagram Example - Order Processing System
The Problem Description
Design package diagram for "Track Order" scenario for an online shopping store. Track Order module is responsible for providing tracking information for the products ordered by customers. Customer types in the tracking serial number, Track Order modules refers the system and updates the current shipping status to the customer.
Identify the packages of the system
There is a track order module, it has to talk with other module to know about the order details, let us call it "Order Details".
Next after fetching Order Details it has to know about Shipping details, let us call that as "Shipping".
Identify the dependencies in the System
Track order should get order details from "Order Details" and "Order Details" has to know the tracking info given by the customer. Two modules are accessing each other which suffices <<access>> dual dependency

To know shipping information, "Shipping" can import "Track Order" to make the navigation easier.
Track Order dependency to UI Framework.

## mAP (mean Average Precision)

May 06, 2021

1. Introduction

mAP is a popular evaluation metric used for object detection (localisation and classification).

Object detection models such as SSD, YOLO make predictions of a bounding box and a class label.

2. True/False Positive for bounding box

For bounding box, we measure the overlap between the predicted bounding box and the ground truth bounding box IoU (intersection over union).

if the IoU value of prediction > IoU threshold, then we classify the prediction as True Positive (TF). On the other hand, if IoU value of prediction < IoU threshold, we classify it as False Positive (FP).

True or False Positive depends on
IoU threshold. In the example above if using IoU threshold=0.2 then FP wil become TP.
In object detection:
True Positive: if True Positive for both classification and bounding box
False Positive: if otherwise

3. Calculate mAP
3.1 mAP

For the picture above with IoU threshold is o.5
- Average Precision (AP) is the area under the precision-recall curve.
- mAP (mean average precision) is the average of AP.
- AP is calculated for each class and averaged to get the mAP.
The mean Average Precision or mAP score is calculated by taking the mean AP over all classes and/or overall IoU thresholds, depending on different detection challenges.

## 3.2 Interpolated precision

The interpolated precision, p_interp, is calculated at each recall level, r, by taking the maximum precision measured for that r.
Consider a model that predicted a dataset contains 5 apples. We collect all the predictions made for apples in all the images and rank it in descending order according to the predicted confidence level.

The Precision at Rank 4th =  (1 + 1) / (1 + 1 + 1+ 1) = 0.5
The Recal at Rank 4th =  (1 + 1) / 5= 0.4

we smooth out the zigzag pattern:

The orange line is transformed into the green lines. We replaced the precision value for recall ȓ with the maximum precision for any recall ≥ ȓ.
We replaced all precision in [0.4:0.8] by max precision at 0.8
Pascal VOC2008 used the 11-point interpolated AP, we divide the recall value from 0 to 1.0 into 11 points — 0, 0.1, 0.2, …, 0.9 and 1.0.

AP = (1/11) * (1+1+1+1+1 +0.57+0.57+0.57+0.57+0.5+0.5)

COCO used a 101-point interpolated AP
AP75 is AP@.75 means the AP with IoU threshold=0.75
AP50 is AP@.50 means the AP with IoU threshold=5

## Metrics to evaluate a classification model's predictions

May 05, 2021

1. Concepts

1.1 Confusion matrix

An NxN table that summarizes how successful a classification model's predictions were.

It shows the correlation between the label and the model's classification.

N represents the number of classes.

In a confusion matrix, one axis is the predicted label, and one axis is the actual label

An example when N=2

actual tumors: 19
tumors correctly classified (true positives): 18
tumors incorrectly classified (false negative): 1
non-tumors: 458
non-tumors correctly classified (true negatives): 452
non-tumors incorrectly classified (false positives): 6

1.2. True vs. False and Positive vs. Negative

Considering the example is based on The Boy Who Cried Wolf story.

Let's make the following definitions:

• "Wolf" is a positive class.
• "No wolf" is a negative class.

A true positive (TP) is an outcome where the model correctly predicts the positive class.
A true negative (TN) is an outcome where the model correctly predicts the negative class.

A false positive (FP) is an outcome where the model incorrectly predicts the positive class.
A false negative (FN) is an outcome where the model incorrectly predicts the negative class.

True/False is predicted
Positive/Negative is ground-truth

1.3 Accuracy

For Binary classification

Dataset has 100 tumor examples, 91 are benign (90 TNs and 1 FP) and 9 are malignant (1 TP and 8 FNs)
Consider the first model:
- In 9 malignant tumors, the model only correctly identifies 1 as malignant
- In 91 benign tumors, the model correctly identifies 90 as benign
Consider the second model that always predicts benign. It also has the same accuracy (91/100 correct predictions).
The first model is no better than second model that has no predictive ability to distinguish malignant tumors from benign tumors.
=> Accuracy doesn't tell the full story when you're working with a class-imbalanced data set, where there is a significant disparity between the number of positive and negative labels.

1.3 Precision and Recall
Precision: What proportion of positive identifications was actually correct?
In all positive cases how many positive cases are correctly predicted?

when it predicts a tumor is malignant, it is correct 50% of the time.

Recall: What proportion of actual positives was identified correctly?
In all positive predicted cases how many positive cases are correctly predicted?

when it correctly identifies 11% of all malignant tumors.

improving precision typically reduces recall and vice versa.
Consider Classifying email messages as spam or not spam when changing threshold.

When the classification threshold is increased, the certainty is increased and FP is decreased so Precision is increased. In other hand, we predict more cases as Negative but the probability that is False is increased and Recall is decreased.
Q: If model A has better precision and better recall than model B, then model A is probably better.
A: In general, a model that outperforms another model on both precision and recall is likely the better model. Obviously, we'll need to make sure that comparison is being done at a precision / recall point that is useful in practice for this to be meaningful. For example, suppose our spam detection model needs to have at least 90% precision to be useful and avoid unnecessary false alarms. In this case, comparing one model at {20% precision, 99% recall} to another at {15% precision, 98% recall} is not particularly instructive, as neither model meets the 90% precision requirement. But with that caveat in mind, this is a good way to think about comparing models when using precision and recall.
1.4 ROC curve
ROC curve (receiver operating characteristic curve) is a graph showing the performance of a classification model at all classification thresholds.
The 2 axes:
• True Positive Rate
• False Positive Rate

Lowering the classification threshold classifies more items as positive, thus increasing both False Positives and True Positives.
To compute the points in an ROC curve, we could evaluate a classification model many times with different classification thresholds, but this would be inefficient. Fortunately, there's an efficient, sorting-based algorithm that can provide this information for us, called AUC.

## How to use the ROC Curve?

We can generally use ROC curves to decide on a threshold value. The choice of threshold value will also depend on how the classifier is intended to be used. So, if the above curve was for a cancer prediction application, you want to capture the maximum number of positives (i.e., have a high TPR) and you might choose a low value of threshold like 0.16 even when the FPR is pretty high here.

This is because you really don’t want to predict “no cancer” for a person who actually has cancer. In this example, the cost of a false negative is pretty high. You are OK even if a person who doesn’t have cancer tests positive because the cost of false positive is lower than that of a false negative. This is actually what a lot of clinicians and hospitals do for such vital tests and also why a lot of clinicians do the same test for a second time if a person tests positive. (Can you think why doing so helps? Hint: Bayes Rule).

Otherwise, in a case like the criminal classifier from the previous example, we don’t want a high FPR as one of the tenets of the justice system is that we don’t want to capture any innocent people. So, in this case, we might choose our threshold as 0.82, which gives us a good recall or TPR of 0.6. That is, we can capture 60 per cent of criminals.

1.5 AUC (Area under the ROC Curve)

AUC near to the 1 which means it has a good measure of separability. A poor model has AUC near to the 0 which means it has the worst measure of separability. Some importtant features:
• It is threshold invariant i.e. the value of the metric doesn’t depend on a chosen threshold.
• It is scale-invariant i.e. It measures how well predictions are ranked, rather than their absolute values.

Consider some cases when plotting the distributions of the classification probabilities.
A perfect classification when when two curves don’t overlap. It can distinguish between positive class and negative class.

The ROC curve:
In this case the AUC = 1
When two distributions overlap
The ROC curve:

In this case the AUC is 0.7, it means there is a 70% chance that the model will be able to distinguish between positive class and negative class.
When two distributions completely overlap. This is the worst case.
The ROC curve:

In this case the AUC is approximately 0.5, the model has no discrimination capacity to distinguish between positive class and negative class.
The AUC=0
The model is predicting a negative class as a positive class and vice versa.
Finally, we have:

In a multi-class model, we can plot N number of AUC ROC Curves for N number classes using the One vs ALL methodology. So for example, If you have three classes named X, Y, and Z, you will have one ROC for X classified against Y and Z, another ROC for Y classified against X and Z, and the third one of Z classified against Y and X.