C10k problem: Difference between revisions

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Revision as of 12:21, 12 February 2014 by 200.120.73.176 (edit summary: It doesn't refer to that, it is that; parentheses are for asides, not for directly relevant article content; citations go after the facts, not before them.)
Latest revision as of 15:54, 31 October 2024 by The Anome (edit summary: History: gr.)
(110 intermediate revisions by 60 users not shown)
Line 1: Line 1:
{{Short description|Problem of optimising network sockets to handle a large number of clients at the same time}}
The '''C10k problem''' is the problem of optimising network sockets to handle a large number of clients at the same time.<ref name=C10K>{{cite web |url=http://www.kegel.com/c10k.html |title=The C10K problem |work= |archivedate=2013-07-28|archiveurl=http://www.webcitation.org/6ICibHuyd}}</ref> The name C10k is a numeronym for concurrently handling ten thousand connections.<ref name=Liu-Deters>{{cite doi| 10.1007/978-3-642-01247-1_16}}</ref> The problem of socket server optimisation has been studied because a number of factors must be considered to allow a web server to support many clients. This can involve a combination of operating system constraints and web server software limitations. According to the scope of services to be made available and the capabilities of the O.S. as well as hardware considerations such as multi-processing capabilities, a multi-threading model or a single-threading model can be preferred. Concurrently with this aspect which involves considerations regarding memory management (usually O.S. related), strategies implied relate to the very diverse aspects of the I/O management.<ref name=Liu-Deters />


The '''C10k problem''' is the problem of optimizing network sockets to handle a large number of clients at the same time.<ref name=C10K>{{Cite web|url=http://www.kegel.com/c10k.html |title=The C10K problem |archive-date=2013-07-22 |archive-url=https://web.archive.org/web/20130722134723/http://www.kegel.com/c10k.html |url-status=live }}</ref> The name C10k is a numeronym for concurrently handling ten thousand connections.<ref name=Liu-Deters>{{Cite book | last1 = Liu | first1 = D. | last2 = Deters | first2 = R. | chapter = The Reverse C10K Problem for Server-Side Mashups | doi = 10.1007/978-3-642-01247-1_16 | title = Service-Oriented Computing – ICSOC 2008 Workshops | series = Lecture Notes in Computer Science | volume = 5472 | pages = 166 | year = 2009 | isbn = 978-3-642-01246-4 }}</ref> Handling many concurrent connections is a different problem from handling many requests per second: the latter requires high throughput (processing them quickly), while the former does not have to be fast, but requires efficient scheduling of connections.
== Servers which address the problem ==
Several web servers have been developed to counter the C10K problem:
* ], which relies on an event-driven (]) architecture, instead of ]s, to handle requests (] uses<ref></ref> nginx to solve the C10K problem)<ref></ref>
* ], which relies on an asynchronous architecture to handle requests<ref></ref>
* ], a lightweight web server<ref></ref>
* ], a non-blocking web server and ]<ref></ref> written in ] (used by Facebook's ])
* (retired, formerly Apache Deft), asynchronous, non-blocking web server running on the ]
* ], a NIO client server framework which enables quick and easy development of network applications such as protocol servers and clients<ref></ref>
* ], a real time web application framework written in ].
* ], asynchronous, non-blocking web server running on ] JavaScript engine<ref></ref>
* ], an asynchronous, non-blocking web server running on Ruby
* ], a web server written in ]; profiting from Erlang's extremely lightweight processes.
* Cowboy (web server), another very lightweight web server written in ]<ref>https://github.com/extend/cowboy extend/cowboy</ref>
* asyncore (in the standard ] library), a non-blocking web server library. It is based on Medusa, which is no longer maintained.
* , asynchronous, high performance cooperative threading library for ].
* ], Microsoft's flagship web server, through the use of asynchronous requests, as demonstrated by third-party components such as
* ] asynchronous Java servlet container
* ], an extremely FTP server written in ]
* , an async and clustered Scala web framework and HTTP(S) server based on ]
* ], some research is being done using the asynchronous IO support in Python 3.3.<ref>https://github.com/aaugustin/django-c10k-demo</ref>
* vibe.d (web framework), a simple asynchronous I/O web framework written in ]<ref>http://vibed.org</ref>
* , event-driven fibers ("ribbons") - "blocking" code simply becomes non-blocking.


The problem of socket server optimisation has been studied because a number of factors must be considered to allow a web server to support many clients. This can involve a combination of operating system constraints and web server software limitations. According to the scope of services to be made available and the capabilities of the operating system as well as hardware considerations such as multi-processing capabilities, a multi-threading model or a single-threading model can be preferred. Concurrently with this aspect, which involves considerations regarding memory management (usually operating system related), strategies implied relate to the very diverse aspects of I/O management.<ref name=Liu-Deters />
There is a benchmark done for comparing the performance of various web frameworks supporting c10k solutions.<ref>http://maxim.livejournal.com/392971.html</ref>

== History ==
The term ''C10k'' was coined in 1999 by software engineer Dan Kegel,{{r|aosa2:nginx}}<ref name = "Dan Kegel, kegel.com, 1999" /> citing the Simtel FTP host, cdrom.com, serving 10,000 clients at once over 1 gigabit per second Ethernet in that year.<ref name="C10K" /> The term has since been used for the general issue of handling a large number of clients, with similar numeronyms for larger numbers of connections, most recently "C10M" in the 2010s to refer to 10 million concurrent connections.<ref name="C10M">{{Cite web|url=https://migratorydata.com/blog/migratorydata-solved-the-c10m-problem/|title=How MigratoryData solved the C10M problem: 10 Million Concurrent Connections on a Single Commodity Server|website=migratorydata.com|language=en|date=2015-05-20|access-date=2021-10-15|author=Mihai Rotaru}}</ref>

By the early 2010s millions of connections on a single commodity 1U rackmount server became possible: over 2 million connections (WhatsApp, 24 cores, using Erlang on FreeBSD)<ref name = "WhatsApp blog, 2012" > {{ Cite web | url = https://blog.whatsapp.com/196/1-million-is-so-2011 | title = 1 million is so 2011 | access-date = 25 July 2019 | date = 6 January 2012 | website = WhatsApp blog | quote = This time we also wanted to share some more technical details with you about hardware, OS and software: hw.machine: amd64 hw.model: Intel(R) Xeon(R) CPU X5675 @ 3.07GHz hw.ncpu: 24 hw.physmem: 103062118400 hw.usermem: 100556451840 | archive-url = https://web.archive.org/web/20140501234954/https://blog.whatsapp.com/196/1-million-is-so-2011 | archive-date = 1 May 2014 | df = dmy-all }} </ref><ref name = "Reed, Erlang Factory, 2012" > {{ Cite web | url = http://www.erlang-factory.com/upload/presentations/558/efsf2012-whatsapp-scaling.pdf | title = Scaling to Millions of Simultaneous Connections | access-date = 25 July 2019 | first = Rick | last = Reed | date = 30 March 2012 | website = Erlang Factory | page = 7 | archive-url = https://web.archive.org/web/20120709235656/http://www.erlang-factory.com/upload/presentations/558/efsf2012-whatsapp-scaling.pdf | archive-date = 9 July 2012 | df = dmy-all }} </ref> and 10–12 million connections (MigratoryData, 12 cores, using Java on Linux).<ref name="C10M" /><ref name="C10M-howto">{{Cite web|url=https://migratorydata.com/blog/migratorydata-with-12-million-concurrent-websockets/|title=Scaling to 12 Million Concurrent Connections: How MigratoryData Did It|website=migratorydata.com|language=en|date=2013-10-10|access-date=2021-10-15|author=Mihai Rotaru}}</ref>

Common applications of very high numbers of connections include general public servers that have to serve thousands or even millions of users at a time, such as file servers, FTP servers, proxy servers, web servers, and load balancers.<ref name="conn-very-high-file">{{Cite book|url=https://books.google.com/books?id=cNwZ1snBYQYC&dq=file+server+very+high+number+of+connections&pg=PA470|title=High Performance Computing - HiPC 2008|language=en|year=2008|access-date=2021-10-15|author1=Ponnuswamy Sadayappan|author2=Manish Parashar|author3=Ramamurthy Badrinath|author4=Viktor K. Prasanna|publisher=Springer |isbn=978-3-540-89893-1}}</ref><ref name="C10M" />


Line 33: Line 19:


== References ==

{{Reflist}} {{Reflist|2|refs=
<ref name=aosa2:nginx>{{cite book
|author= Andrew Alexeev
|section-url= http://www.aosabook.org/en/nginx.html
|section= §14. nginx; §14.1. Why Is High Concurrency Important?
|editor1= Amy Brown |editor2= Greg Wilson
|url= http://aosabook.org/en/index.html#aosa2
|title= The Architecture of Open Source Applications, Volume II: Structure, Scale and a Few More Fearless Hacks
|publisher= Lulu.com |publication-date= 2012 |isbn= 9781105571817
|quote= Around ten years ago, Daniel Kegel, a prominent software engineer, … Kegel's C10K manifest … solving the C10K problem of 10,000 simultaneous connections, nginx …
}}</ref>
<ref name = "Dan Kegel, kegel.com, 1999" > {{ Cite web | url = http://www.kegel.com/c10k.html | title = The C10K problem | access-date = 18 June 2019 | first = Dan | last = Kegel | date = 8 May 1999 | website = Kegel com | quote = <nowiki>And computers are big, too. You can buy a 500MHz machine with 1 gigabyte of RAM and six 100Mbit/sec Ethernet card for $3000 or so. Let's see - at 10000 clients, that's 50KHz, 100Kbytes, and 60Kbits/sec per client. It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of ten thousand clients. (That works out to $0.30 per client, by the way. Those $100/client licensing fees some operating systems charge are starting to look a little heavy!) So hardware is no longer the bottleneck.</nowiki> | archive-url = https://web.archive.org/web/19990508164301/http://www.kegel.com/c10k.html | archive-date = 8 May 1999 | df = dmy-all }} </ref>

}}



Latest revision as of 15:54, 31 October 2024

Problem of optimising network sockets to handle a large number of clients at the same time

The C10k problem is the problem of optimizing network sockets to handle a large number of clients at the same time. The name C10k is a numeronym for concurrently handling ten thousand connections. Handling many concurrent connections is a different problem from handling many requests per second: the latter requires high throughput (processing them quickly), while the former does not have to be fast, but requires efficient scheduling of connections.
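
The difference between many concurrent connections and many requests per second can be illustrated with a little arithmetic. The sketch below assumes a steady-state workload and uses Little's law (average number in the system = arrival rate × average time in the system), which is not part of the article itself; the function name and the two workloads are hypothetical.

```python
def concurrent_connections(arrivals_per_second, mean_duration_seconds):
    """Estimate average open connections via Little's law (steady state assumed)."""
    return arrivals_per_second * mean_duration_seconds


# Hypothetical long-lived workload: few new connections, each held open for a long time.
# Low request rate, but on the order of ten thousand connections open at once.
long_lived = concurrent_connections(arrivals_per_second=100, mean_duration_seconds=100.0)

# Hypothetical short-lived workload: very high request rate, each finished quickly.
# High throughput, but only on the order of a hundred connections open at once.
short_lived = concurrent_connections(arrivals_per_second=10_000, mean_duration_seconds=0.01)

print(f"long-lived workload:  ~{long_lived:.0f} concurrent connections")
print(f"short-lived workload: ~{short_lived:.0f} concurrent connections")
```

Under these assumptions the first workload is the C10k case even though its request rate is a hundred times lower than the second.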

Socket server optimisation has been studied because several factors determine how many clients a web server can support. These include operating system constraints and limitations of the web server software. Depending on the scope of the services offered, the capabilities of the operating system, and hardware considerations such as multi-processing capability, either a multi-threading model or a single-threading model may be preferred. Alongside this choice, which involves memory management considerations (usually operating system related), the strategies involved concern the many different aspects of I/O management.
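
As an illustration of the single-threading model, the following is a minimal sketch of an event-driven echo server in which one thread multiplexes all connections through the operating system's readiness notification facility. It uses Python's standard `selectors` and `socket` modules; the function names (`make_listener`, `poll_once`, `serve_forever`), the port, and the buffer sizes are assumptions chosen for the example, not anything prescribed by the article or its sources.

```python
import selectors
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 lets the OS pick a free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1024)
    listener.setblocking(False)
    return listener


def poll_once(sel, listener, timeout=0.1):
    """Run one event-loop iteration; return how many sockets were ready."""
    events = sel.select(timeout)
    for key, _mask in events:
        sock = key.fileobj
        if sock is listener:
            # A new client is waiting: accept it and watch it for readability.
            conn, _addr = listener.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = sock.recv(4096)
            if data:
                # Simplification: a production server would buffer partial writes
                # and wait for EVENT_WRITE instead of calling sendall directly.
                sock.sendall(data)
            else:
                # Empty read means the peer closed the connection.
                sel.unregister(sock)
                sock.close()
    return len(events)


def serve_forever(host="127.0.0.1", port=8000):
    """Serve echo clients on a single thread until interrupted."""
    sel = selectors.DefaultSelector()
    listener = make_listener(host, port)
    sel.register(listener, selectors.EVENT_READ)
    while True:
        poll_once(sel, listener)
```

No thread or process is created per client, so the per-connection cost is a registered socket and its kernel buffers. A multi-threading design would instead dedicate a thread, with its own stack, to each connection, which is where the memory management considerations mentioned above come in.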

History

The term C10k was coined in 1999 by software engineer Dan Kegel, citing the Simtel FTP host, cdrom.com, serving 10,000 clients at once over 1 gigabit per second Ethernet in that year. The term has since been used for the general issue of handling a large number of clients, with similar numeronyms for larger numbers of connections, most recently "C10M" in the 2010s to refer to 10 million concurrent connections.

By the early 2010s millions of connections on a single commodity 1U rackmount server became possible: over 2 million connections (WhatsApp, 24 cores, using Erlang on FreeBSD) and 10–12 million connections (MigratoryData, 12 cores, using Java on Linux).

Common applications of very high numbers of connections include general public servers that have to serve thousands or even millions of users at a time, such as file servers, FTP servers, proxy servers, web servers, and load balancers.

References

  1. "The C10K problem". Archived from the original on 2013-07-22.
  2. Liu, D.; Deters, R. (2009). "The Reverse C10K Problem for Server-Side Mashups". Service-Oriented Computing – ICSOC 2008 Workshops. Lecture Notes in Computer Science. Vol. 5472. p. 166. doi:10.1007/978-3-642-01247-1_16. ISBN 978-3-642-01246-4.
  3. Andrew Alexeev (2012). "§14. nginx; §14.1. Why Is High Concurrency Important?". In Amy Brown; Greg Wilson (eds.). The Architecture of Open Source Applications, Volume II: Structure, Scale and a Few More Fearless Hacks. Lulu.com. ISBN 9781105571817. Around ten years ago, Daniel Kegel, a prominent software engineer, … Kegel's C10K manifest … solving the C10K problem of 10,000 simultaneous connections, nginx
  4. Kegel, Dan (8 May 1999). "The C10K problem". Kegel com. Archived from the original on 8 May 1999. Retrieved 18 June 2019. And computers are big, too. You can buy a 500MHz machine with 1 gigabyte of RAM and six 100Mbit/sec Ethernet card for $3000 or so. Let's see - at 10000 clients, that's 50KHz, 100Kbytes, and 60Kbits/sec per client. It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of ten thousand clients. (That works out to $0.30 per client, by the way. Those $100/client licensing fees some operating systems charge are starting to look a little heavy!) So hardware is no longer the bottleneck.
  5. Mihai Rotaru (2015-05-20). "How MigratoryData solved the C10M problem: 10 Million Concurrent Connections on a Single Commodity Server". migratorydata.com. Retrieved 2021-10-15.
  6. "1 million is so 2011". WhatsApp blog. 6 January 2012. Archived from the original on 1 May 2014. Retrieved 25 July 2019. This time we also wanted to share some more technical details with you about hardware, OS and software: hw.machine: amd64 hw.model: Intel(R) Xeon(R) CPU X5675 @ 3.07GHz hw.ncpu: 24 hw.physmem: 103062118400 hw.usermem: 100556451840
  7. Reed, Rick (30 March 2012). "Scaling to Millions of Simultaneous Connections" (PDF). Erlang Factory. p. 7. Archived from the original (PDF) on 9 July 2012. Retrieved 25 July 2019.
  8. Mihai Rotaru (2013-10-10). "Scaling to 12 Million Concurrent Connections: How MigratoryData Did It". migratorydata.com. Retrieved 2021-10-15.
  9. Ponnuswamy Sadayappan; Manish Parashar; Ramamurthy Badrinath; Viktor K. Prasanna (2008). High Performance Computing - HiPC 2008. Springer. ISBN 978-3-540-89893-1. Retrieved 2021-10-15.