how dependency injection and containers evolved

reference : http://jaceju.net/2014-07-27-php-di-container/

why do we need dependency injection?
Because we want to decouple a class from its dependencies. Instead of creating the objects inside the class, we pass them in as parameters at construction time.

When App creates its own auth and session objects, changing how we authenticate or how the session is stored means changing the App class too.

Much better now. Since we want to support different ways to authenticate, we had better turn Auth into an interface.

Looks great, so why do we need a container?
Sometimes we don't know which implementation we want until the program runs (HTTP auth or DB auth). We can register the class we want in a container and retrieve it later.

Looks perfect, but every time we need an App object we have to write all of this code again. It is wise to wrap it in a function.

Could we make it better?
Yes, we can. Every time we create an App object, we have to register its dependencies in the container by hand. If the container could resolve the dependencies by itself, we would no longer have to provide them. This is where ReflectionClass comes in.
http://php.net/manual/en/class.reflectionclass.php

Now a single line of code gets us what we want.
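The referenced article works in PHP; the same progression can be sketched in Python, with inspect.signature standing in for ReflectionClass. This is only a minimal sketch: the names Container, register, get, HttpAuth, Session and App are hypothetical and chosen to mirror the notes, and it assumes every constructor parameter carries a class type hint.

```python
import inspect


class Auth:
    """Plays the role of the Auth interface."""

    def check(self):
        raise NotImplementedError


class HttpAuth(Auth):
    def check(self):
        return True


class Session:
    pass


class App:
    # The type hints tell the container what to inject.
    def __init__(self, auth: Auth, session: Session):
        self.auth = auth
        self.session = session


class Container:
    def __init__(self):
        self._bindings = {}

    def register(self, abstract, concrete):
        """Choose at run time which class stands in for an interface."""
        self._bindings[abstract] = concrete

    def get(self, cls):
        concrete = self._bindings.get(cls, cls)
        # Inspect the constructor parameters (the reflection step) and
        # resolve each annotated dependency recursively.
        params = inspect.signature(concrete).parameters
        deps = {name: self.get(p.annotation) for name, p in params.items()}
        return concrete(**deps)


container = Container()
container.register(Auth, HttpAuth)
app = container.get(App)  # one line: auth and session are built for us
```

Swapping HTTP auth for DB auth then only means changing the register call; App itself stays untouched.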

 

html summary

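1.html entity conversion

Result   Description          Entity name   Entity number (decimal / hex)
         non-breaking space   &nbsp;        &#160; (&#xA0;)
<        less than            &lt;          &#60; (&#x3C;)
>        greater than         &gt;          &#62; (&#x3E;)
&        ampersand            &amp;         &#38; (&#x26;)
"        double quote         &quot;        &#34; (&#x22;)
'        single quote         &apos;        &#39; (&#x27;)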

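2.cache
#don't use the cache:
Cache-Control: no-store
(never store the response; fetch it again every time)
Cache-Control: no-cache
(the response may be stored, but it must be revalidated with the server before every reuse)
#use the cache:
Cache-Control: max-age=[seconds]
Expires: [date]
Cache-Control: public (shared caches such as proxies may store the response)
Cache-Control: private (only the user's own browser may store the response)
validation:
(1) if Expires or max-age is set and the cached copy is still fresh, the browser loads the file directly from the cache, without making a request.
(2) if the cached copy is no longer fresh, the browser sends a conditional request asking the server whether the copy is still valid. If it is, the server returns 304 Not Modified; otherwise it handles the request normally and returns a new response. The validators are:
Last-Modified / If-Modified-Since
ETag / If-None-Match
(3) if no cached copy is available, the browser sends a normal request to the server.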
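The revalidation step in (2) can be sketched from the server side as follows. This is an illustration only: respond and make_etag are hypothetical helpers, the ETag is assumed to be a hash of the body, and max-age=60 is an arbitrary value.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # A validator derived from the content (one possible choice).
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, body) for a GET of a resource whose content is `body`."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if request_headers.get("If-None-Match") == etag:
        # The client's copy is still valid: answer 304 and send no body.
        return 304, headers, b""
    return 200, headers, body


# First visit: no validator yet, so the full response is sent.
status, headers, _ = respond({}, b"hello")
# Revalidation with the stored ETag while the content is unchanged.
status2, _, body2 = respond({"If-None-Match": headers["ETag"]}, b"hello")
# Revalidation after the content changed on the server.
status3, _, body3 = respond({"If-None-Match": headers["ETag"]}, b"changed")
```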

3.same origin policy & cors
ref:
https://web-security.guru/en/web-security/same-origin-policy
http://www.ruanyifeng.com/blog/2016/04/cors.html
https://enable-cors.org/

sop:
two URLs have the same origin only if they share the
same protocol
same domain
same port
A page can still load img, video, script etc. from another origin, but it can't do the following:
(1)Read Ajax responses (XMLHttpRequest or fetch) from another origin
(2)Read or write the Document Object Model (DOM) of another origin
(3)Read or write stored data (cookies, session & local storage) of another origin
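The three-part comparison can be sketched with the standard library. The helper names origin and same_origin are hypothetical, and the default-port table only covers http and https.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (protocol, domain, port) triple that defines an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a, b):
    return origin(a) == origin(b)
```

For example, https://example.com/a and https://example.com:443/b share an origin, while a different protocol, subdomain or port makes the origins differ.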

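cors:
the server publishes a whitelist (the Access-Control-Allow-Origin response header) that allows other origins to read its resources.
It does not secure your content by itself, since clients other than browsers ignore it, but it protects an innocent user from scripts running on a malicious website: the browser will not let those scripts read the response, because the malicious site's origin is not listed in Access-Control-Allow-Origin.
If you want to share your content with another origin and still keep it confidential, you can use OAuth2.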
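A server-side whitelist check might look like the sketch below. ALLOWED_ORIGINS and cors_headers are hypothetical names, and the origin in the set is a placeholder.

```python
# Origins that are allowed to read our responses (placeholder value).
ALLOWED_ORIGINS = {"https://app.example.com"}


def cors_headers(request_origin):
    """Return the CORS headers to add to a response for the given Origin header."""
    if request_origin in ALLOWED_ORIGINS:
        # Echo the origin back; Vary tells caches the response depends on it.
        return {"Access-Control-Allow-Origin": request_origin, "Vary": "Origin"}
    # No header: the browser refuses to hand the response to the calling script.
    return {}
```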

security summary

you can't trust user input

1.sql injection
attack:
(payloads sent in a URL need to be URL-encoded)
1.login bypass (' or 1=1 /*)
2.integer bypass (1;drop table users)
3.reading information (' union select username,NULL,NULL from user /*)
4.inserting or updating information (' update  groupid where user = 100 /*)
5.second-order injection (create a user name like ' drop table user/*. The single quote may be escaped correctly when the record is stored in the db, but the problem occurs later, when you build another query with this stored username.)
protection:
prepared statements
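To see why prepared statements help, here is a small sketch using Python's sqlite3 module with an in-memory database. The table, the functions login_unsafe and login_safe, and the sample data are made up for illustration; the payload uses the -- comment form instead of /* so that it works in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")


def login_unsafe(name, password):
    # User input is pasted into the SQL text, so it can change the query itself.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_safe(name, password):
    # Placeholders keep the input as data; it is never parsed as SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


payload = "' OR 1=1 --"
bypassed = login_unsafe("alice", payload)  # the injected OR makes the WHERE clause true
blocked = login_safe("alice", payload)     # the payload is compared as a plain string
```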

2.image upload
attack:
1.upload a php file directly when there is no extension check
2.use a filename with an invisible character (fool.php(%00).jpg): it passes the extension check but is stored on the server as fool.php
protection:
if there is an uploaded file
validate the upload path (is_empty, is_dir, is_writable)
if the upload succeeded
check the extension (whitelist of extensions)
check the MIME type (whitelist of MIME types)
check the filename (limit the length, no invisible characters, remove spaces)
check the size and dimensions
rename the file
do not overwrite existing files
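The filename-related checks from the list above could be sketched like this. It covers only the extension whitelist, the length limit, invisible characters and renaming; MIME, size and dimension checks are left out. The name safe_stored_name and both constants are hypothetical.

```python
import os
import uuid

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}  # whitelist, not blacklist
MAX_NAME_LENGTH = 100


def safe_stored_name(original_name):
    """Validate an uploaded filename and return a fresh name to store it under."""
    if len(original_name) > MAX_NAME_LENGTH:
        raise ValueError("filename too long")
    # Rejects NUL bytes and other control characters (the fool.php%00.jpg trick).
    if any(not ch.isprintable() for ch in original_name):
        raise ValueError("invisible character in filename")
    ext = os.path.splitext(original_name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed")
    # Renaming means user-controlled names never reach the file system
    # and existing files are not overwritten.
    return uuid.uuid4().hex + ext
```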

3.csrf
attack:
1.the attacker sends a link via email or chat that forces an end user to execute unwanted actions on a web application in which they're currently authenticated (e.g. transferring money via e-banking)
protection:
issue a CSRF token (for example in a cookie), check the token that comes back with each request on the backend, then regenerate it. This makes sure the request was sent by the correct end user.
two-factor authentication
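The issue, check and regenerate cycle can be sketched as below. The session is assumed to be a plain dict kept on the server, and issue_token and verify_token are hypothetical helper names.

```python
import hmac
import secrets


def issue_token(session):
    """Create a fresh random token, remember it server-side, and return it to the page."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def verify_token(session, submitted):
    """Check the token that came back with the request; each token is single-use."""
    expected = session.pop("csrf_token", None)  # forces regeneration afterwards
    if expected is None or submitted is None:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A forged request sent from another site does not know the token, so verify_token returns False for it.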

4.xss
attack:
http://wooyun.jozxing.cc/search?keywords=author%3A+%E5%BF%83%E4%BC%A4%E7%9A%84%E7%98%A6%E5%AD%90&content_search_by=by_bugs
1.reflected xss

2.dom xss

protection:
1.http-only cookie (prevents scripts from reading the cookie via document.cookie)
2.use htmlspecialchars with ENT_QUOTES to escape output in html
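The notes refer to PHP's htmlspecialchars; as a rough analogue, Python's html.escape with quote=True also escapes both kinds of quotes. The sample input below is made up.

```python
import html

user_input = "<script>alert('x')</script>"

# quote=True also escapes " and ', similar in spirit to ENT_QUOTES.
escaped = html.escape(user_input, quote=True)
print(escaped)  # &lt;script&gt;alert(&#x27;x&#x27;)&lt;/script&gt;
```

The browser then renders the input as text instead of executing it.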

 

php summary

1.cgi & fastCgi
request(index.php) -> webserver --(cgi)--> php interpreter (needs to initialize its environment on every request)
cgi is a protocol that standardizes the data passed between the web server and the php interpreter (post data, header data, url, query string).

request(index.php) -> webserver --(fastCgi)--> php interpreter (doesn't need to initialize its environment on every request; a master process handles this)
fastCgi is a protocol as well. But instead of initializing the environment every time a request comes in, a fastCgi program starts a master process that does the initialization once and forks php interpreter workers to handle the requests. php-fpm is a fastCgi program.

2.psr
https://github.com/php-fig/fig-standards/tree/master/accepted
https://github.com/squizlabs/PHP_CodeSniffer


mysql summary

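1.avoid redundancy when designing the database, unless it is for speed and the data rarely changes

2.rather than updating a record, insert a new record to carry the updated data; this way no lock is needed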

3.mysql master-slave + ha

https://www.slideshare.net/matsunobu/automated-master-failover

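4.composite index

4.1 each field used in the index should take as little space as possible, so that each node holds more data and the b+tree's height is reduced

4.2 leftmost-prefix matching rule: mysql keeps matching index columns from left to right until it meets a range condition (>, <, between, like), so put the field used in range queries last in the index

4.3 the = and in conditions in the sql can appear in any order; the optimizer reorders them so that the index is still used

4.4 prefer fields with high selectivity for the index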

5.run explain on slow queries to make sure they use the right index.

5.1 type

(1)system: the table has zero or one row; a special case of const.
example: explain select * from (select * from t3 where id=3952602) a

id  select_type  table       type    possible_keys      key      key_len  ref   rows  Extra
1   PRIMARY      <derived2>  system  NULL               NULL     NULL     NULL  1
2   DERIVED      t3          const   PRIMARY,idx_t3_id  PRIMARY  4              1

(2)const: the table has at most one matching row, found through an index.
example: select * from t3 where id=3952602;

id  select_type  table  type   possible_keys      key      key_len  ref    rows  Extra
1   SIMPLE       t3     const  PRIMARY,idx_t3_id  PRIMARY  4        const  1

(3) eq_ref: all parts of an index are used by the join and the index is PRIMARY KEY or UNIQUE NOT NULL.
example: explain select * from t3,t4 where t3.id=t4.accountid;

id  select_type  table  type    possible_keys      key        key_len  ref           rows  Extra
1   SIMPLE       t4     ALL     NULL               NULL       NULL     NULL          1000
2   SIMPLE       t3     eq_ref  PRIMARY,idx_t3_id  idx_t3_id  4        db.accountid  1

(4) ref: all of the matching rows of an indexed column are read for each combination of rows from the previous table (the index is not unique).
example: explain select * from t3,t4 where t3.id=t4.accountid;

id  select_type  table  type  possible_keys      key        key_len  ref           rows  Extra
1   SIMPLE       t4     ALL   NULL               NULL       NULL     NULL          1000
2   SIMPLE       t3     ref   PRIMARY,idx_t3_id  idx_t3_id  4        db.accountid  1

(5) ref_or_null: same as ref, but mysql also searches for rows that contain NULL values

(6) index_merge: the join uses a list of indexes to produce the result set

(7) unique_subquery: an IN subquery returns only one result from the table and makes use of the primary key.
example:value IN (SELECT primary_key FROM single_table WHERE some_expr)

(8) index_subquery: the same as unique_subquery but returns more than one result row.

(9) range: an index is used to find matching rows in a specific range, typically when the key column is compared to a constant using operators like BETWEEN, IN, >, >=, etc.
example:explain select * from t3 where id=3952602 or id=3952603;

(10) index:the entire index tree is scanned to find matching rows

(11) all: the entire table is scanned to find matching rows for the join

5.2 key: the index that is actually used

6.innoDB supports transactions, foreign keys and row-level locks.

7.horizontal sharding
https://medium.com/@Pinterest_Engineering/sharding-pinterest-how-we-scaled-our-mysql-fleet-3f341e96ca6f

8.use utf8mb4 https://mathiasbynens.be/notes/mysql-utf8mb4

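9.common optimization techniques

9.1 where xxx in (subquery) => select * from a join (subquery) as b on a.xxx = b.xxx (using index xxx)
9.2 if you only need one row, add limit 1; mysql stops searching as soon as it finds one record
9.3 select only the fields you need
9.4 where a or b => a union all b (use union instead if a row can match both a and b, otherwise that row is returned twice)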

gradient descent

reference:
https://www.zhihu.com/question/36301367/answer/142096153
https://www.kdnuggets.com/2017/04/simple-understand-gradient-descent-algorithm.html

prerequisite
1.derivative (only one variable)

\begin{equation*} \frac{d}{dx}f(x)\Big|_{x=x_{0}} = f'(x_{0}) = \lim_{\Delta x\to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x_{0}+\Delta x)-f(x_{0})}{\Delta x} \end{equation*}

it measures the sensitivity of f(x) to an infinitesimal change in x (the slope of the tangent line)
(from wiki)
2. partial derivative (multiple variables, but only along the coordinate axes, not an arbitrary direction)
the function has at least two variables.
It is just the ordinary derivative taken along each dimension in turn (the x, y, z, ... axes), holding the other variables fixed.

\begin{equation*} \frac{\partial}{\partial x_{i}}f(X) = \lim_{h\to 0} \frac{f(x_{1},\dots,x_{i-1},x_{i}+h,x_{i+1},\dots,x_{n})-f(x_{1},\dots,x_{i},\dots,x_{n})}{h} \end{equation*}


3.directional derivative (multiple variables, any direction)
break the direction vector down into its components along each dimension, so that we can use the partial derivatives to solve this

it is just a vector dot product ([partial derivatives] * [unit vector in this direction]):
each component of the direction vector weights the partial derivative along that dimension.

     \begin{align*} D_{\vec v}f(\vec i) = \nabla_{\vec v}f(\vec i) = \lim_{h \to 0} \frac{f(\vec i+h \vec v)-f(\vec i)}{h} = \sum_{n=1}^{m} v_{n}\frac{\partial f}{\partial i_{n}} = \nabla f(\vec i) \cdot \vec v \end{align*}

we only care about the direction of the vector, so \vec v is a unit vector

so what is the gradient?
the gradient answers the question: in which direction is the slope steepest? That means finding the unit vector that maximizes [partial derivatives] * [direction vector].
the dot product of two vectors is |a||b|cos(theta), so it is largest when the two vectors point the same way (theta = 0), and that gives the steepest slope. Since we only care about the direction,
the gradient = the vector of partial derivatives

In conclusion, if we want to minimize the cost function, we move each variable a small step against its partial derivative, i.e. in the direction opposite to the gradient. This is gradient descent.

another way to understand it:
 \Delta C \approx \nabla C \cdot \Delta v \;\;\;\;\; \nabla C : (vector\; of\; partial\; derivatives)
if we want to minimize the cost, we should choose \Delta v so that \Delta C is negative.
suppose we choose \Delta v = - \alpha \nabla C, where \alpha is a small, positive parameter.
then \Delta C \approx - \alpha ||\nabla C||^2 \le 0
so the change in cost is never positive, which is what we are looking for.
by repeatedly updating
 v \to v' = v - \alpha \nabla C
we keep decreasing the cost function until it reaches a (local) minimum
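The update rule can be tried on a toy cost function. The cost C(v) = (v1 - 3)^2 + (v2 + 1)^2, the starting point, alpha and the step count are all assumptions picked for illustration; its minimum is at (3, -1).

```python
def cost(v):
    # Example cost with a known minimum at v = (3, -1).
    return (v[0] - 3) ** 2 + (v[1] + 1) ** 2


def grad_cost(v):
    # Vector of partial derivatives of the example cost.
    return [2 * (v[0] - 3), 2 * (v[1] + 1)]


def gradient_descent(v, alpha=0.1, steps=100):
    costs = [cost(v)]
    for _ in range(steps):
        g = grad_cost(v)
        # v -> v' = v - alpha * grad C
        v = [vi - alpha * gi for vi, gi in zip(v, g)]
        costs.append(cost(v))
    return v, costs


v_min, costs = gradient_descent([0.0, 0.0])
```

Each step shrinks the cost, matching the argument that the change in C is roughly -alpha * ||grad C||^2.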

coursera machine learning unit one

definition
a computer program is said to learn from experience E with respect to task T and some performance measure P, if its performance on T, as measured by P , improves with experience E.

type
1. supervised learning
1.1 classification (mapping to label, discrete)
1.2 regression (mapping to continuous number)
2.unsupervised learning (cluster data)

supervised learning workflow

(from coursera)

how to measure the accuracy of the hypothesis (linear)
#linear regression cost function

     \begin{equation*} J(\theta_{0},\theta_{1}) = \frac{1}{2m}\sum_{i=1}^{m} (\hat{y}_{i}-y_{i})^2 = \frac{1}{2m}\sum_{i=1}^{m}(h_{\theta}(x_{i})-y_{i})^2 \end{equation*}
where \hat{y}_{i} is the predicted value and h_{\theta}(x) = \theta_{0} + \theta_{1}x is the linear function formed by \theta_{0},\theta_{1}

find the theta that minimizes the cost function. When the cost function equals 0, all the data points lie exactly on the line.

how to find the theta that minimizes the cost
#gradient descent
(for why gradient descent works, see the gradient descent notes above)
repeat until convergence (update all the theta simultaneously) {

     \[ \theta_{i} := \theta_{i}-\alpha\frac{\partial}{\partial \theta_{i}} J(\theta_{0},\theta_{1}) \]

}
where i = 0, 1
#gradient descent for linear regression
repeat until convergence {

     \[ \theta_{0} := \theta_{0} - \alpha\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x_{i})-y_{i}) \]
     \[ \theta_{1} := \theta_{1} - \alpha\frac{1}{m}\sum_{i=1}^{m}((h_{\theta}(x_{i})-y_{i}) x_{i}) \]

}
detail:
https://math.stackexchange.com/q/1695446
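The two update rules translate fairly directly into code. The following is a minimal sketch, not the course's implementation: fit_line is a hypothetical name, the data points lie exactly on y = 2x + 1, and alpha and the iteration count are chosen by hand.

```python
def fit_line(xs, ys, alpha=0.05, iterations=5000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # h_theta(x_i) - y_i for every training example
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the old theta.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


theta0, theta1 = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
```

Because the sample data lies on a line, theta0 and theta1 approach 1 and 2 and the cost approaches 0.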

hidden markov model (decoding)

application:
speech recognition
word segmentation
1. hidden states (weather): Sunny, Cloudy, Rainy;
2. observed states (seaweed dampness): Dry, Dryish, Damp, Soggy;
3. initial state probabilities: Sunny(0.63), Cloudy(0.17), Rainy(0.20);
4. state transition matrix (rows: yesterday's weather, columns: today's weather):
            sunny   cloudy  rainy
sunny       0.5     0.375   0.125
cloudy      0.25    0.125   0.625
rainy       0.25    0.375   0.375
5. confusion matrix (rows: hidden states, columns: observed states):
            dry     dryish  damp    soggy
sunny       0.6     0.2     0.15    0.05
cloudy      0.25    0.25    0.25    0.25
rainy       0.05    0.1     0.35    0.5

Finding the most probable sequence of hidden states
brute force
enumerate all 3^3 = 27 hidden state sequences and keep the one with the highest joint probability with the observations (dry,damp,soggy):
best = max(p(sunny,sunny,sunny ; dry,damp,soggy), p(sunny,sunny,cloudy ; dry,damp,soggy), p(sunny,sunny,rainy ; dry,damp,soggy), ...)

viterbi (dynamic programming)
at each phase, calculate the best partial path ending in each hidden state, and record the previous state that path came from
phase 1(dry):
p(sunny) = initial p(sunny) * p(dry|sunny) = 0.63 * 0.6 = 0.378
p(cloudy) = initial p(cloudy) * p(dry|cloudy) = 0.17 * 0.25 = 0.0425
p(rainy) = initial p(rainy) * p(dry|rainy) = 0.2 * 0.05 = 0.01
phase 2(damp):
p(sunny) = max(p(previous phase sunny)*p(sunny|sunny), p(previous phase cloudy)*p(sunny|cloudy), p(previous phase rainy)*p(sunny|rainy)) * p(damp|sunny) = max(0.378*0.5, 0.0425*0.25, 0.01*0.25) * 0.15 = 0.378*0.5*0.15 = 0.02835   prev-state : sunny
(p(cloudy) and p(rainy) are computed the same way)
phase 3(soggy):
same as phase 2; the answer is read off by starting from the most probable state in the last phase and following the recorded prev-states backwards
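The phases above can be written as a short program. This sketch uses the seaweed model from this section; the dictionary layout and the function name viterbi are my own choices.

```python
STATES = ["sunny", "cloudy", "rainy"]
INITIAL = {"sunny": 0.63, "cloudy": 0.17, "rainy": 0.20}
# TRANSITION[yesterday][today]
TRANSITION = {
    "sunny": {"sunny": 0.5, "cloudy": 0.375, "rainy": 0.125},
    "cloudy": {"sunny": 0.25, "cloudy": 0.125, "rainy": 0.625},
    "rainy": {"sunny": 0.25, "cloudy": 0.375, "rainy": 0.375},
}
# EMISSION[hidden state][observed state] (the confusion matrix)
EMISSION = {
    "sunny": {"dry": 0.6, "dryish": 0.2, "damp": 0.15, "soggy": 0.05},
    "cloudy": {"dry": 0.25, "dryish": 0.25, "damp": 0.25, "soggy": 0.25},
    "rainy": {"dry": 0.05, "dryish": 0.1, "damp": 0.35, "soggy": 0.5},
}


def viterbi(observations):
    """Return (most probable hidden state sequence, its joint probability)."""
    # Phase 1: initial probability times emission probability.
    best = {s: INITIAL[s] * EMISSION[s][observations[0]] for s in STATES}
    back_pointers = []
    for obs in observations[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            # Best previous state for a partial path that ends in s.
            prev = max(STATES, key=lambda p: best[p] * TRANSITION[p][s])
            new_best[s] = best[prev] * TRANSITION[prev][s] * EMISSION[s][obs]
            pointers[s] = prev
        best = new_best
        back_pointers.append(pointers)
    # Walk the recorded prev-states backwards from the best final state.
    last = max(STATES, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


path, probability = viterbi(["dry", "damp", "soggy"])
```

For (dry, damp, soggy) this yields the path sunny, cloudy, rainy with probability of about 0.01107.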

hidden markov model (evaluation)

application:
checking how well a known model fits the observations
1. hidden states (weather): Sunny, Cloudy, Rainy;
2. observed states (seaweed dampness): Dry, Dryish, Damp, Soggy;
3. initial state probabilities: Sunny(0.63), Cloudy(0.17), Rainy(0.20);
4. state transition matrix (rows: yesterday's weather, columns: today's weather):
            sunny   cloudy  rainy
sunny       0.5     0.375   0.125
cloudy      0.25    0.125   0.625
rainy       0.25    0.375   0.375
5. confusion matrix (rows: hidden states, columns: observed states):
            dry     dryish  damp    soggy
sunny       0.6     0.2     0.15    0.05
cloudy      0.25    0.25    0.25    0.25
rainy       0.05    0.1     0.35    0.5

calculate the probability of the observed sequence (dry,damp,soggy) given the hidden markov model we are using

[diagram: trellis with the hidden states sunny, cloudy, rainy at each of the three steps, above the observations dry, damp, soggy]
1.brute force
sum the joint probability over all 3^3 = 27 hidden state sequences:
p((dry,damp,soggy)|hmm) = p(sunny,sunny,sunny ; dry,damp,soggy) + p(sunny,sunny,cloudy ; dry,damp,soggy) + p(sunny,sunny,rainy ; dry,damp,soggy) + ...
2.forward (dynamic programming)
at each phase, for each hidden state, sum the probabilities of all the transitions leading into it (instead of taking the max as in viterbi); the sum becomes that state's probability for the next phase
phase 1:
p(sunny) = initial p(sunny) * p(dry|sunny) = 0.63 * 0.6 = 0.378
p(cloudy) = initial p(cloudy) * p(dry|cloudy) = 0.17 * 0.25 = 0.0425
p(rainy) = initial p(rainy) * p(dry|rainy) = 0.2 * 0.05 = 0.01
phase 2:
p(sunny) = (p(previous phase sunny) * p(sunny|sunny) + p(previous phase cloudy) * p(sunny|cloudy) + p(previous phase rainy) * p(sunny|rainy)) * p(damp|sunny) = (0.378*0.5 + 0.0425*0.25 + 0.01*0.25) * 0.15 = 0.202125 * 0.15 = 0.030319
phase 3:
same calculation as phase 2; the answer is the sum of the three state probabilities of the last phase
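The forward pass can be sketched the same way, again with the seaweed model from this section. The dictionary layout and the function name forward are my own choices.

```python
STATES = ["sunny", "cloudy", "rainy"]
INITIAL = {"sunny": 0.63, "cloudy": 0.17, "rainy": 0.20}
# TRANSITION[yesterday][today]
TRANSITION = {
    "sunny": {"sunny": 0.5, "cloudy": 0.375, "rainy": 0.125},
    "cloudy": {"sunny": 0.25, "cloudy": 0.125, "rainy": 0.625},
    "rainy": {"sunny": 0.25, "cloudy": 0.375, "rainy": 0.375},
}
# EMISSION[hidden state][observed state] (the confusion matrix)
EMISSION = {
    "sunny": {"dry": 0.6, "dryish": 0.2, "damp": 0.15, "soggy": 0.05},
    "cloudy": {"dry": 0.25, "dryish": 0.25, "damp": 0.25, "soggy": 0.25},
    "rainy": {"dry": 0.05, "dryish": 0.1, "damp": 0.35, "soggy": 0.5},
}


def forward(observations):
    """Return (p(observations | model), list of per-phase state probabilities)."""
    alpha = {s: INITIAL[s] * EMISSION[s][observations[0]] for s in STATES}
    phases = [alpha]
    for obs in observations[1:]:
        # Sum over every previous state, then apply the emission probability.
        alpha = {
            s: sum(alpha[p] * TRANSITION[p][s] for p in STATES) * EMISSION[s][obs]
            for s in STATES
        }
        phases.append(alpha)
    return sum(alpha.values()), phases


total, phases = forward(["dry", "damp", "soggy"])
```

For (dry, damp, soggy) the total comes out at about 0.0269, and the phase 2 value for sunny is about 0.0303.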