In some circumstances we meet words that contain accented Latin characters, like the word naïve, especially in names like simão. Sometimes we want to convert them to naive and simao, or at least to know that they are equivalent to naive and simao respectively.
I ran into such a problem recently and tried to find a solution, but honestly, it's hard even to describe the question. While browsing all the possibly related pages Google showed me, I found an ES6 function called normalize which finally helped me out. If you look at the description of this function's argument and trace it to the concept of Canonical Decomposition, you will probably go "wow" like I did. Yes, it's exactly what we want: we want È É Ê Ë to be equal to E, and ì í î ï all equal to i. With this function we can easily solve the problem, and I'm really glad I found the solution even before I knew how to describe the problem.
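To see what Canonical Decomposition actually does, here is a tiny demonstration (results shown in comments):

```javascript
// Under NFD, the precomposed "ï" (U+00EF) decomposes into the base letter
// "i" plus a combining diaeresis (U+0308).
const word = "naïve";
console.log(word.length);                  // 5
console.log(word.normalize("NFD").length); // 6: the accent is now a separate code point
```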
So here is the utility function:

```javascript
function convertLatin(str) {
  return str.normalize("NFD").match(/\w/g).join("");
}
```
And you'll find that convertLatin("naïve") === "naive" and convertLatin("simão") === "simao". Enjoy this small utility!
## Update (26 Mar 2017)
After some closer investigation, I found the solution above is neither robust nor necessary, as ES6 provides official support for this use case. Please check the APIs String.prototype.localeCompare and Intl.Collator.
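For example (a minimal sketch): with sensitivity set to "base", strings that differ only in accents compare as equal.

```javascript
// localeCompare returns 0 when the two strings are considered equal.
console.log("naïve".localeCompare("naive", undefined, { sensitivity: "base" })); // 0

// Intl.Collator is the reusable form of the same comparison.
const collator = new Intl.Collator(undefined, { sensitivity: "base" });
console.log(collator.compare("simão", "simao")); // 0
```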
First, to clarify the original question. The twelve-factor methodology requires: "Make the tools gap small: keep development and production as similar as possible." In hindsight, the development/production consistency demanded here is really about consistency between the CI/CD environments, or more concretely, between the Jenkins build nodes and the production nodes. With the introduction of container technology, consistency of the small per-application environment comes naturally, while consistency of the node OS and of the Docker engine itself (its version) can easily be achieved by reusing things like Chef recipes or OpenStack Heat templates. As for the problems that inconsistent container orchestrators may cause (say, docker-compose on the Jenkins slave but Kubernetes in production), I think those are for Operations to solve; after all, DevOps exists to make Ops' life easier, not to make Ops unnecessary.
My boss asked me to try building a high-TPS HTTP server with Node. The business logic itself doesn't matter; the point is simply to test how much better this technology, reputedly well suited to I/O, is than a Java web container. The test results, in English:
I tried several approaches to increase the TPS of a Node.js HTTP server, to check whether it is competitive as an easy tool for certain specific tasks. I created a simple HTTP server based on Node.js's native http module. It receives HTTP requests, records their information into a (remote) MongoDB, then responds with 'Okey'.
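The original server code isn't shown in this post; a minimal sketch of the setup as described (the URL, database, and collection names here are hypothetical) might look like this:

```javascript
const http = require("http");
const { MongoClient } = require("mongodb"); // assumes the official Node.js driver

const client = new MongoClient("mongodb://remote-host:27017"); // hypothetical remote DB

client.connect().then(() => {
  const requests = client.db("bench").collection("requests");

  http.createServer((req, res) => {
    // Record basic request information, then respond with 'Okey'.
    requests
      .insertOne({ url: req.url, method: req.method, at: new Date() })
      .then(() => {
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Okey");
      });
  }).listen(1337);
});
```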
The test tool is Apache Bench, installed on the same host machine as the HTTP server: a Dell OptiPlex 7010 desktop with an 8-core CPU and 8 GB of memory, running Oracle Linux Server 6.8.
Optimization approaches include:
- Increasing the host server's 'open files' limit with "ulimit -n 99999" (the default is 1024), and increasing V8's default old-space heap size with '--max-old-space-size=2048'; the resulting limits are shown below
- Reusing TCP connections for successive requests, i.e. making use of the Keep-Alive feature of HTTP/1.0
- Making the HTTP server a cluster, to make use of more CPU cores
- Changing the business logic to respond immediately upon receiving a request, instead of waiting for the database to finish recording

```
[root@pu ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 31197
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 99999
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 31197
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
```
## OS-level tuning
Increasing the max open files (and hence the number of sockets) as well as the heap size didn't improve performance, which means we had reached neither the limit on parallel sockets nor the memory limit.
## Reuse connections
The HTTP header 'Connection: keep-alive' is needed for HTTP/1.0 to reuse a connection for further requests, while for HTTP/1.1 connections are keep-alive by default. Apache Bench uses HTTP/1.0, and with the parameter "-k" it adds the keep-alive header. Since HTTP/1.0 can't make use of 'Transfer-Encoding: chunked', there is only one way for the client to determine the boundary between successive responses on a single connection: 'Content-Length'. The content length is easy to know when serving a static file, but for a dynamic page we need to calculate 'Content-Length' manually and set it in the response header, which is what we did by adding code to the Node.js HTTP server. By doing this, the throughput increased:
```
[root@pu ~]# ab -n 10000 -c 10 http://localhost:1337/
Concurrency Level:      10
Time taken for tests:   1.512 seconds
[root@pu ~]# ab -n 10000 -c 10 -k http://localhost:1337/
Concurrency Level:      10
Time taken for tests:   1.144 seconds
```
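The post doesn't show the exact change; a sketch of the manual Content-Length handling described above could look like this:

```javascript
const http = require("http");

http.createServer((req, res) => {
  const body = "Okey";
  res.writeHead(200, {
    "Content-Type": "text/plain",
    // Use byte length rather than string length, in case the body ever
    // contains multi-byte characters.
    "Content-Length": Buffer.byteLength(body),
  });
  res.end(body);
}).listen(1337);
```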
## Introduce concurrency
Introducing concurrency has two aspects: raising the concurrency level of the test client, and raising the concurrency level of the HTTP server. Since we run the test on the very machine where the HTTP server is deployed, the bottleneck can shift between client and server, so blindly raising the concurrency level won't always improve performance.

Raising the concurrency of Apache Bench is easy: just increase the value of the "-c" parameter. Doing so increases TPS, but only within a certain range, roughly 1 to 50; beyond that range TPS stops growing, though it doesn't decrease either. For example, raising the concurrency level to a nonsensically high value yields no more TPS than 50 does.

To add concurrency to the Node.js HTTP server, we use Node's built-in 'cluster' module, creating several workers that compete for a single port. After some tuning, I found that 4 workers gave the best performance. Unlike with Apache Bench, raising the HTTP server's concurrency above 4 caused total TPS to decrease, because the extra workers occupy CPU resources that Apache Bench needs.
```
[root@pu ~]# ab -n 10000 -c 100 -k http://localhost:1337/
Concurrency Level:      100
Time taken for tests:   0.794 seconds
```
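A sketch of the cluster setup described above (the original code isn't shown; this assumes the era-appropriate cluster API):

```javascript
const cluster = require("cluster");
const http = require("http");

if (cluster.isMaster) {
  // Four workers turned out to be the sweet spot on this 8-core machine.
  for (let i = 0; i < 4; i++) {
    cluster.fork();
  }
} else {
  // Every worker listens on the same port; the cluster module distributes
  // incoming connections among them.
  http.createServer((req, res) => {
    const body = "Okey";
    res.writeHead(200, { "Content-Length": Buffer.byteLength(body) });
    res.end(body);
  }).listen(1337);
}
```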
It has been argued that several workers competing for the same port is less efficient than four workers each listening on its own port, with a reverse proxy like Nginx in front to balance the load. I haven't tried this approach yet.
## Change business logic
I removed the code snippet that writes to MongoDB and tested again. In this situation, the Node.js server has the same TPS as the Apache httpd server.
```
[root@pu ~]# ab -n 10000 -c 100 -k http://localhost:1337/
Concurrency Level:      100
Time taken for tests:   0.251 seconds
```
So for static pages, Node.js is not especially powerful; its value lies in the fact that once business logic is added, the TPS doesn't drop off rapidly.
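Rather than deleting the write entirely, the 'respond immediately' approach from the list above can be sketched like this (the requests collection is the same hypothetical handle as in the earlier server sketch):

```javascript
http.createServer((req, res) => {
  const body = "Okey";
  res.writeHead(200, { "Content-Length": Buffer.byteLength(body) });
  res.end(body); // respond right away

  // Fire-and-forget: let the MongoDB insert finish in the background,
  // logging (rather than surfacing) any error.
  requests
    .insertOne({ url: req.url, method: req.method, at: new Date() })
    .catch(console.error);
}).listen(1337);
```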
## Future tests
Stability: the last time I tried this HTTP server, it showed periodic TPS drops, probably related to V8's GC, so this needs more detailed investigation.