Draft:WKdm

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.

Latest revision as of 22:45, 5 January 2025

----
{{Short description|Virtual memory compression algorithm}}
{{Use list-defined references}}

The '''WKdm''' algorithm is one of the first in the class of '''WK''' virtual memory compression techniques, developed initially by Paul R. Wilson and Scott F. Kaplan et al. circa 1999.<ref name="CaseForCompressedCaching" /> The "'''dm'''" in the WKdm acronym stands for "direct mapped" and refers to the direct-mapped hash method used to map uncompressed words in memory to the WKdm algorithm's dictionary.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" /><ref name="xnu2" /><ref name="xnu3" />
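The direct-mapped dictionary lookup that the "dm" refers to can be sketched as follows. The particular hash shown (high-order bits modulo the table size) is an illustrative assumption; real implementations choose their own hash function.

```python
DICT_SIZE = 16  # WKdm uses a 16-entry direct-mapped dictionary

def dict_index(word):
    # Direct-mapped: each 32-bit word hashes to exactly one slot, so a
    # lookup is a single table probe with no search. (This hash is
    # illustrative, not the one used by any particular implementation.)
    return (word >> 10) % DICT_SIZE

# Words that differ only in their low 10 bits map to the same slot,
# which is what lets the algorithm find partial (high-bit) matches.
assert dict_index(0xCAFE << 10 | 0x001) == dict_index(0xCAFE << 10 | 0x3FF)
```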


==Motivation==

The key insight on which the WKdm algorithm is built is the realization that most high-level programming languages compile to output whose data section(s) exhibit certain very strong regularities with regard to integers and pointers.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" /> Firstly, a large number of integers and pointers are word-aligned within records, where "word" here and for the rest of this article refers to 32 bits.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" /> Additionally, most integers hold small values relative to their maximum ranges.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" /> Likewise, pointers that lie near one another in memory tend to reference addresses that are themselves close together.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" /> Finally, certain data patterns, particularly words of all zeroes, occur frequently, and this is exploited by the algorithm.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" />

To make use of these regularities, one need only observe that words frequently share many of their high-order bits, either because their values are not large enough to require a full word's bit width or because they are pointers whose targets lie close in memory to those of nearby pointers.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" /> Words of all zeroes, which occur frequently, can also be compressed very cheaply.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" />
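The regularity can be seen with two hypothetical nearby heap addresses (the values below are made up for illustration): they agree in their high-order 22 bits and differ only in their low 10 bits.

```python
# Two made-up pointers into the same region of memory: because they are
# close together, their high-order 22 bits are identical and only the
# low 10 bits (the offset within a 1 KiB span) differ.
a, b = 0x0804A120, 0x0804A3F0  # hypothetical nearby heap addresses
assert a >> 10 == b >> 10      # high 22 bits match
assert a & 0x3FF != b & 0x3FF  # only the low 10 bits differ
```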


==Algorithm==

===Compression===

The WKdm algorithm reads one word at a time from an address range, usually a page or pages, and uses a 16-entry direct-mapped dictionary of words to produce compressed output segregated into four arrays or "segments", which contain, respectively, "tags" (2-bit values indicating the type of match or non-match), dictionary indices, unmatched words, and the lower 10 bits of partially matched words.<ref name="CaseForCompressedCaching" /><ref name="xnu2" /> The tag, index, and partial-match values are initially output into bytes or words in their respective segments, and are "packed" once the words in the address range to be compressed are exhausted.<ref name="CaseForCompressedCaching" /><ref name="xnu2" />

For each word read, the word is mapped to the dictionary using a direct-mapped hash, and the type of match is then determined.<ref name="CaseForCompressedCaching" /><ref name="xnu2" /><ref name="simpson1" /> If a full 32-bit match is found in the dictionary, a 2-bit "tag" value indicating a full-word match is written to the tags segment and the 4-bit index of the matching entry is written to the indices segment.<ref name="CaseForCompressedCaching" /><ref name="xnu2" /> If only the high-order 22 bits match, a different tag is written to the tags segment, the dictionary index of the partial match is written to the indices segment, and the differing 10 low-order bits are recorded in the partial-match segment.<ref name="CaseForCompressedCaching" /><ref name="xnu2" /> If no match is found, the new value is added to the dictionary as well as being emitted to the unmatched-words segment, and another tag signaling this is written to the tags segment.<ref name="CaseForCompressedCaching" /><ref name="xnu2" /> If the word read is all zeroes, only a tag indicating this is output to the tags segment.<ref name="CaseForCompressedCaching" /><ref name="xnu2" />

After all the words in the address range have been read, the tags, indices, and 10-bit partial-match values, which are stored in bytes or words in their segments, are "packed" within their respective segments (i.e. their bits are made contiguous if the segment is viewed as one large bit vector; additional steps may be taken, and the exact details are implementation specific) using bitwise operations to further reduce the compressed size of the data.<ref name="CaseForCompressedCaching" /><ref name="xnu2" />
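The per-word matching loop described above can be sketched in Python as follows. The stream names, the concrete 2-bit tag values, and the hash are illustrative assumptions, and the final bit-packing step is omitted; real implementations pack the four segments into bit fields.

```python
DICT_SIZE = 16
ZERO, EXACT, PARTIAL, MISS = 0, 1, 2, 3  # illustrative 2-bit tag values

def dict_index(word):
    # Direct-mapped hash on the high-order "pattern" bits (illustrative).
    return (word >> 10) % DICT_SIZE

def compress(words):
    """Return the four unpacked segments for a sequence of 32-bit words."""
    dictionary = [0] * DICT_SIZE
    tags, indices, unmatched, low_bits = [], [], [], []
    for w in words:
        if w == 0:
            tags.append(ZERO)                 # all-zero word: tag only
            continue
        i = dict_index(w)
        if dictionary[i] == w:
            tags.append(EXACT)                # full 32-bit match
            indices.append(i)
        elif dictionary[i] >> 10 == w >> 10:
            tags.append(PARTIAL)              # high 22 bits match
            indices.append(i)
            low_bits.append(w & 0x3FF)        # record differing low 10 bits
            dictionary[i] = w
        else:
            tags.append(MISS)                 # no match: emit literal word
            unmatched.append(w)
            dictionary[i] = w
    return tags, indices, unmatched, low_bits

# A zero word, a literal, an exact repeat, then a partial match.
w1, w2 = (0xCAFE << 10) | 0x111, (0xCAFE << 10) | 0x222
tags, indices, unmatched, low_bits = compress([0, w1, w1, w2])
assert tags == [ZERO, MISS, EXACT, PARTIAL]
assert unmatched == [w1] and low_bits == [0x222]
```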


===Decompression===

Decompression is straightforward. The tags segment is processed one 2-bit tag at a time, and an action is taken depending on the value of the tag.<ref name="CaseForCompressedCaching" /><ref name="xnu3" /> If the value indicates a full-word match, the corresponding dictionary index in the indices segment is read and the dictionary value it references is output.<ref name="CaseForCompressedCaching" /><ref name="xnu3" /> If a partial match is indicated, the corresponding entry in the indices segment is consulted to look up the value supplying the high-order 22 bits, and the partial-match segment is read to reconstruct the full 32-bit word, which is written to the uncompressed output.<ref name="CaseForCompressedCaching" /><ref name="xnu3" /> If the current tag indicates that there was no match, the corresponding 32-bit word in the unmatched-words segment is read, added to the dictionary, and emitted as part of the uncompressed output.<ref name="CaseForCompressedCaching" /><ref name="xnu3" /> If the tag indicates an all-zero word, a 32-bit zero value is sent to the output.<ref name="CaseForCompressedCaching" /><ref name="xnu3" />
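The tag-driven decoding loop described above can be sketched the same way, using an illustrative unpacked four-segment layout; the tag values, names, and hash below are assumptions, not an actual implementation's packed encoding. Note that the decompressor applies the same dictionary updates as the compressor, so both sides stay in sync.

```python
DICT_SIZE = 16
ZERO, EXACT, PARTIAL, MISS = 0, 1, 2, 3  # illustrative 2-bit tag values

def dict_index(word):
    # Must be the same direct-mapped hash the compressor used (illustrative).
    return (word >> 10) % DICT_SIZE

def decompress(tags, indices, unmatched, low_bits):
    """Rebuild the original 32-bit words from the four unpacked segments."""
    dictionary = [0] * DICT_SIZE
    idx, lit, low = iter(indices), iter(unmatched), iter(low_bits)
    out = []
    for t in tags:
        if t == ZERO:
            out.append(0)                      # all-zero word
        elif t == EXACT:
            out.append(dictionary[next(idx)])  # copy the dictionary entry
        elif t == PARTIAL:
            i = next(idx)                      # high 22 bits from dictionary,
            w = (dictionary[i] & ~0x3FF) | next(low)  # low 10 from the stream
            dictionary[i] = w                  # mirror the compressor's update
            out.append(w)
        else:                                  # MISS: take the literal word
            w = next(lit)
            dictionary[dict_index(w)] = w      # mirror the compressor's update
            out.append(w)
    return out

# Decode a tiny hand-built stream: a zero word, a literal, an exact
# repeat, then a partial match that replaces the low 10 bits.
w1 = (0xCAFE << 10) | 0x111
out = decompress([ZERO, MISS, EXACT, PARTIAL], [14, 14], [w1], [0x222])
assert out == [0, w1, w1, (0xCAFE << 10) | 0x222]
```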


==Performance==

Tests and real-world performance data show<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" /><ref name="acm2019-compress" /> that WKdm compression achieves a compression ratio comparable or superior to that of LZ-based dictionary compressors.<ref name="Phoronix1" /><ref name="AMMC1" /> The WKdm algorithm also has much less overhead than an LZ-class compressor, as it uses a dictionary only 64 bytes in size, compared to, e.g., 64 kilobytes.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" /> Furthermore, because of the simplicity of the algorithm, compression and decompression are usually much faster than with traditional LZ-based compressors.<ref name="CaseForCompressedCaching" /><ref name="simpson1" /><ref name="CCache1" />

==Variants==

The original authors of the WKdm algorithm also developed the so-called "WK4x4" algorithm.<ref name="CaseForCompressedCaching" /><ref name="CCache1" /> This variation uses a 4-way set-associative cache instead of a direct-mapped hash for the dictionary.<ref name="CaseForCompressedCaching" /><ref name="CCache1" /> Its performance, though, was shown to be the same as or worse than WKdm's in most cases.<ref name="CaseForCompressedCaching" /><ref name="CCache1" />

Matthew Simpson et al. developed a variation on the original WKdm algorithm named "WKS", in which compression is performed in place without the need for any temporary arrays or "segments", cutting down on temporary memory requirements.<ref name="simpson1" /> The algorithm also prevents compressed-data expansion through its ability to detect incompressible data.<ref name="simpson1" /> Furthermore, WKS extends what is considered a partial match for a given word.<ref name="simpson1" />


==Notable implementations==

WKdm compression has been used by Apple's OS X since version 10.9 Mavericks in 2013<ref name="Arstechnica" /><ref name="ap1" /><ref name="ap2" /><ref name="CNET1" /> and appears in the shared source of Apple's XNU operating-system kernel.<ref name="xnu1" /><ref name="xnu2" /><ref name="xnu3" /> WKdm compression has also been implemented in the OLPC Linux kernel.<ref name="olpc1" /><ref name="olpc2" /> A new implementation for the Linux virtual-memory manager, called "CCache", has also been demonstrated to work with the WKdm algorithm and its variants on an embedded device running an ARM processor.<ref name="CCache1" />


==See also==

== References ==
{{Reflist|refs=
<ref name="CaseForCompressedCaching">{{cite conference |url=https://www.usenix.org/legacy/event/usenix99/full_papers/wilson/wilson.pdf |title=The Case for Compressed Caching in Virtual Memory Systems |author-last1=Wilson |author-first1=Paul R. |author-last2=Kaplan |author-first2=Scott F. |author-last3=Smaragdakis |author-first3=Yannis |date= 1999-06-06 |conference=USENIX Annual Technical Conference |location=Monterey, California, USA |pages=101–116}}</ref>
<ref name="simpson1">{{cite journal |last1=Simpson |first1=Matthew |last2=Barua |first2=Rajeev |last3=Biswas |first3=Surupa |date=12 Aug 2003 |title=Analysis of Compression Algorithms for Program Data |url=https://terpconnect.umd.edu/~barua/matt-compress-tr.pdf |journal=University of Maryland |volume= |issue= |pages=4–14 |doi= |access-date=6 Jan 2024}}</ref>
<ref name="Arstechnica">{{Cite web|url=https://arstechnica.com/apple/2013/10/os-x-10-9/17/#compressed-memory|title=OS X 10.9 Mavericks: The Ars Technica Review|date=22 October 2013}}</ref>
<ref name="ap1">{{cite web |url=https://images.apple.com/media/us/osx/2013/docs/OSX_Mavericks_Core_Technology_Overview.pdf |title=OS X Mavericks Core Technologies Overview October 2013 |author=<!--Not stated--> |date=Oct 2013 |website=apple.com |access-date=9 Jan 2024 |pages=7}}</ref>
<ref name="ap2">{{cite web |url=https://www.apple.com/osx/all-features/pdf/osx_elcapitan_core_technologies_overview.pdf |title=OS X El Capitan Core Technologies Overview September 2015 |author=<!--Not stated--> |date=Sep 2015 |website=apple.com |access-date=9 Jan 2024 |pages=7}}</ref>
<ref name="xnu1">{{cite journal |last1=G. Richard III |first1=Golden |last2=Case |first2=Andrew |date=17 Jul 2014 |title=In lieu of swap: Analyzing compressed RAM in Mac OS X and Linux |url=https://www.sciencedirect.com/science/article/pii/S1742287614000541 |journal=Digital Investigation |volume=11 |issue= |pages=S7–S8 |doi= 10.1016/j.diin.2014.05.011|access-date=11 Jan 2024}}</ref>
<ref name="xnu2">The explanation of the ARM64 implementation and of the general WKdm compression algorithm is contained in the second block comment. Also, similar explanations can be found in the OSFMK ARM32, x64 and x86 code. {{cite web |url=https://github.com/apple/darwin-xnu/blob/main/osfmk/arm64/WKdmCompress_4k.s |title=WKdmCompress_4k.s |author=<!--Not stated--> |date=28 Sep 2017 |website=github.com |publisher= |access-date=15 Jan 2024 |quote=}}</ref>
<ref name="xnu3">The explanation of the ARM64 implementation and of the general WKdm decompression algorithm is contained in the second block comment. Also, similar explanations can be found in the OSFMK ARM32, x64 and x86 code. {{cite web |url=https://github.com/apple/darwin-xnu/blob/main/osfmk/arm64/WKdmDecompress_4k.s |title=WKdmDecompress_4k.s |author=<!--Not stated--> |date=28 Sep 2017 |website=github.com |publisher= |access-date=15 Jan 2024 |quote=}}</ref>
<ref name="olpc1">{{cite journal |last1=Beltran |first1=Vicenç |last2=Martorell |first2=Xavier |last3=Torres |first3=Jordi |last4=Ayguadé |first4=Eduard |date=2008 |title=Accelerating Software Memory Compression on the Cell/B.E. |url=https://people.ac.upc.edu/vbeltran/papers/WAHA2009.pdf |journal=First Workshop on Hardware Accelerators for High-Performance Computing |volume= |issue= |pages=2 |doi= |access-date=11 Jan 2024}}</ref>
<ref name="olpc2">{{cite web |url=https://marc.info/?l=olpc-linux-mm-cc&m=117972870310131&w=2 |title= Compression structure implementation |last=Gupta |first=Nitin |date=7 Jul 2006 |website=marc.info |access-date=9 Jan 2024}}</ref>
<ref name="CCache1">{{cite journal |last1=Farias Briglia |first1=Anderson |last2=Bezerra |first2=Allan |last3=Moiseichuk |first3=Leonid |last4=Gupta |first4=Nitin |date=2007 |title=Evaluating effects of cache memory compression on embedded systems |url=https://kernel.org/doc/mirror/ols2007v1.pdf#page=53 |journal=2007 Linux Symposium |volume= |issue= |pages=56–57 |doi= |access-date=11 Jan 2024}}</ref>
<ref name="CNET1">{{cite web |url=https://www.cnet.com/tech/computing/memory-compression-brings-ram-doubler-to-os-x-mavericks/ |title=Memory compression brings RAM Doubler to OS X Mavericks |last=Kessler |first=Topher |date=10 Jun 2013 |website=cnet.com |publisher= |access-date=3 March 2024 |quote=}}</ref>
<ref name="Phoronix1">{{cite web |url=https://www.phoronix.com/news/LZ4m-Compression |title=LZ4m: Taking LZ4 Compression To The Next Level |last=Larabel |first=Michael |date=2 Jun 2017 |website=phoronix.com |publisher= |access-date=3 Mar 2024 |quote=But LZ4m does come up short of the WKdm page compressor's compression ratio and compression speed.}}</ref>
<ref name="acm2019-compress">{{cite web |url=https://github.com/jonnor/acm2019-compress |title=Cache and Memory Compression Techniques |last1=Isaac |first1=Sean |last2=Anderson |first2=Geronimo |last3=Nordby |first3=Jon |date=19 July 2019 |website=github.com |publisher= |access-date=3 Mar 2024 |quote=}}</ref>
<ref name="AMMC1">{{cite journal |last1=Chihaia Tuduce |first1=Irina |last2=Gross |first2=Thomas |date=2005 |title=Adaptive Main Memory Compression |url=https://www.usenix.org/legacy/event/usenix05/tech/general/full_papers/tuduce/tuduce.pdf |journal=USENIX Annual Technical Conference |volume= |issue= |pages=237–250 |doi= |access-date=5 Mar 2024 |quote=For all experiments, we use the WKdm compression algorithm as it shows superior performance over other algorithms}}</ref>
}}

Latest revision as of 22:45, 5 January 2025

Submission declined on 25 June 2024 by S0091 (talk).This draft's references do not show that the subject qualifies for a Misplaced Pages article. In summary, the draft needs multiple published sources that are:
Make sure you add references that meet these criteria before resubmitting. Learn about mistakes to avoid when addressing this issue. If no additional references exist, the subject is not suitable for Misplaced Pages.This submission appears to read more like an advertisement than an entry in an encyclopedia. Encyclopedia articles need to be written from a neutral point of view, and should refer to a range of independent, reliable, published sources, not just to materials produced by the creator of the subject being discussed. This is important so that the article can meet Misplaced Pages's verifiability policy and the notability of the subject can be established. If you still feel that this subject is worthy of inclusion in Misplaced Pages, please rewrite your submission to comply with these policies.
  • If you would like to continue working on the submission, click on the "Edit" tab at the top of the window.
  • If you have not resolved the issues listed above, your draft will be declined again and potentially deleted.
  • If you need extra help, please ask us a question at the AfC Help Desk or get live help from experienced editors.
  • Please do not remove reviewer comments or this notice until the submission is accepted.

Where to get help
  • If you need help editing or submitting your draft, please ask us a question at the AfC Help Desk or get live help from experienced editors. These venues are only for help with editing and the submission process, not to get reviews.
  • If you need feedback on your draft, or if the review is taking a lot of time, you can try asking for help on the talk page of a relevant WikiProject. Some WikiProjects are more active than others so a speedy reply is not guaranteed.
How to improve a draft

You can also browse Misplaced Pages:Featured articles and Misplaced Pages:Good articles to find examples of Misplaced Pages's best writing on topics similar to your proposed article.

Improving your odds of a speedy review

To improve your odds of a faster review, tag your draft with relevant WikiProject tags using the button below. This will let reviewers know a new draft has been submitted in their area of interest. For instance, if you wrote about a female astronomer, you would want to add the Biography, Astronomy, and Women scientists tags.

Add tags to your draft Editor resources Declined by S0091 6 months ago. Last edited by BD2412 4 seconds ago. Reviewer: Inform author.
ResubmitPlease note that if the issues are not fixed, the draft will be declined again.
Submission declined on 2 March 2024 by MicrobiologyMarcus (talk).This submission is not adequately supported by reliable sources. Reliable sources are required so that information can be verified. If you need help with referencing, please see Referencing for beginners and Citing sources. Declined by MicrobiologyMarcus 10 months ago.
Submission declined on 8 January 2024 by Lewcm (talk).This draft's references do not show that the subject qualifies for a Misplaced Pages article. In summary, the draft needs multiple published sources that are:
Make sure you add references that meet these criteria before resubmitting. Learn about mistakes to avoid when addressing this issue. If no additional references exist, the subject is not suitable for Misplaced Pages. Declined by Lewcm 11 months ago.
  • Comment: still some largely unsourced sections. microbiologyMarcus 12:51, 2 March 2024 (UTC)

Virtual memory compression algorithm

The WKdm algorithm is one of the first in the class of WK virtual memory compression techniques developed initially by Paul R. Wilson and Scott F. Kaplan et al. circa 1999. The "dm" in the WKdm acronym stands for "direct mapped" and refers to the direct mapped hash method used to map uncompressed words in memory to the WKdm algorithm's dictionary.

Motivation

The key insight on which the WKdm algorithm is built is the realization that most high-level programming languages compile to output whose data section(s) have certain very strong data regularities with regard to integers and pointers. Firstly, a large amount of integers and pointers are word-aligned within records, where "word" here will henceforth refer to 32 bits. Additionally, most integers usually contain small values relative to their maximum ranges. For pointers, most proximal in memory to each other reference addresses that are close to each other in memory. Finally, certain data patterns, particularly words of all zeroes, frequently occur and this is exploited by the algorithm.

To make use of the above data regularities, one need only realize that, frequently, words will share many of their high-order bits either because they aren't large enough to require a full-word bit width, or, said words are pointers whose values reference addresses close in memory to those referenced by nearby pointers. Also, words of all zeroes, which occur frequently, can be easily compressed.

Algorithm

Compression

The WKdm algorithm reads one word at a time from an address range, usually a page or pages, and uses a 16-entry direct mapped dictionary of words to produce compressed output which is segregated into four arrays or "segments" which contain, respectively, "tags" ( 2-bit values indicating the type of (non)match ), dictionary indices, unmatched words and the lower 10 bits of partially matched words. The tag, index and partial match values are initially output into bytes or words in their respective segments, before being “packed” after the number of words in the addresses range to be compressed is exhausted.

For each word read, the word is mapped to the dictionary using a direct-mapped hash, and then the type of (non)match is determined. If a full 32-bit word match is found in the dictionary, then a 2-bit "tag" value indicating a full-word match is written to the tags segment and the 4-bit index of the match within the dictionary is written to the indices segment. If only the high-order 22 bits match, then a different tag is written to the tags segment, the dictionary index of the partial match is output to the indices segment and the differing 10 lower-order bits are recorded in the partial match segment. If no match is found then the new value is added to the dictionary, as well as being emitted to the unmatched-words segment, and another tag signaling this is written to the tags segment. If the read word is all zeroes, then only one tag indicating this is output to the tags segment.

After all the words in the address range to be compressed have been read, the tags, indices and 10-bit partial match values, which are stored in bytes or words in their segments, are "packed" within their respective segments ( e.g. their bits are made contiguous if their particular segment is taken to be one large bit vector. Additional steps may be taken — the exact details are implementation specific ) using bitwise operations to further reduce the compressed size of the data.

Decompression

Decompression is straightforward. The tags segment is processed one 2-bit tag at a time, and action is taken depending on the tag's value. If the value indicates a full-word match, the corresponding dictionary index is read from the indices segment and the dictionary entry at that index is output. If a partial match is indicated, the corresponding entry in the indices segment is consulted to look up the value whose high-order 22 bits match, the partial-match segment is read to reconstruct the full 32-bit word, the dictionary entry is updated with the reconstructed word, and the word is written to the uncompressed output. If the current tag indicates that there was no match, the corresponding 32-bit word in the unmatched-words segment is added to the dictionary and emitted as part of the uncompressed output. If the tag indicates a word that was all zeroes, a 32-bit zero value is sent to the output. Because the decompressor applies the same dictionary updates in the same order as the compressor, its dictionary stays in lock-step with the compressor's, and no dictionary needs to be stored alongside the compressed data.
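
The decompression steps above can be sketched as follows, mirroring the compression model: the tag encodings and the stand-in hash are assumptions and must match whatever the compressor used, since the decompressor rebuilds the dictionary by replaying the compressor's updates.

```python
# Illustrative model of WKdm-style decompression. Constants and the hash are
# stand-ins and must mirror the compressor's.
ZERO, EXACT, PARTIAL, MISS = 0, 1, 2, 3
DICT_SIZE = 16
LOW_MASK = (1 << 10) - 1
HIGH_MASK = 0xFFFFFFFF & ~LOW_MASK

def dict_index(word):
    return (word >> 10) % DICT_SIZE       # must match the compressor's hash

def decompress(tags, indices, lows, misses):
    """Replay the tag stream, consuming the other segments as needed."""
    dictionary = [0] * DICT_SIZE
    out = []
    idx, low, miss = iter(indices), iter(lows), iter(misses)
    for tag in tags:
        if tag == ZERO:
            out.append(0)
        elif tag == EXACT:                # full match: dictionary lookup only
            out.append(dictionary[next(idx)])
        elif tag == PARTIAL:              # rebuild word from high 22 + low 10 bits
            i = next(idx)
            word = (dictionary[i] & HIGH_MASK) | next(low)
            dictionary[i] = word          # replay the compressor's update
            out.append(word)
        else:                             # MISS: replay the insertion too
            word = next(miss)
            dictionary[dict_index(word)] = word
            out.append(word)
    return out
```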

Performance

Tests and real-world performance data show that WKdm compression achieves a compression ratio comparable to or better than that of LZ-based dictionary compressors. The WKdm algorithm also has much lower overhead than an LZ-class compressor, as it uses a dictionary of only 64 bytes (16 four-byte entries), compared with, for example, a 64-kilobyte window. Furthermore, because of the simplicity of the algorithm, compression and decompression are usually much faster than with traditional LZ-based compressors.

Variants

The original authors of the WKdm algorithm also developed the so-called "WK4x4" algorithm, a variation that uses a 4-way set-associative cache for the dictionary instead of a direct-mapped hash. Its performance, however, was shown to be the same as or worse than WKdm's in most cases.
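
The difference between the two dictionary organizations can be illustrated with a minimal sketch of a 4-way set-associative dictionary (16 entries as four sets of four ways). The set-index hash and the least-recently-used replacement policy here are assumptions for illustration.

```python
# Minimal sketch of a WK4x4-style dictionary: 4 sets x 4 ways = 16 entries.
# The hash and the LRU replacement policy are assumptions for illustration.
DICT_SETS, WAYS = 4, 4

def set_index(word):
    return (word >> 10) % DICT_SETS       # stand-in hash selects a set

class SetAssocDict:
    """4-way set-associative dictionary with LRU replacement within a set."""
    def __init__(self):
        # each set is a list of words, ordered least to most recently used
        self.sets = [[0] * WAYS for _ in range(DICT_SETS)]

    def lookup(self, word):
        """Return (way, entry) whose high 22 bits match, else None."""
        s = self.sets[set_index(word)]
        for way, entry in enumerate(s):
            if (entry >> 10) == (word >> 10):
                return way, entry
        return None

    def insert(self, word):
        """Insert word, replacing a matching entry or evicting the LRU one."""
        s = self.sets[set_index(word)]
        hit = self.lookup(word)
        s.pop(hit[0] if hit is not None else 0)   # 0 = least recently used
        s.append(word)                            # most recently used position
```

A direct-mapped dictionary evicts on every conflicting hash, while the set-associative one can keep four conflicting words alive at once, at the cost of comparing against all four ways per lookup.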

Matthew Simpson et al. developed a variation on the original WKdm algorithm named "WKS", in which compression is performed in place, without the need for any temporary arrays or "segments", cutting down on temporary memory requirements. The algorithm also avoids expanding compressed data by detecting incompressible input, and it broadens what is considered a partial match for a given word.

Notable implementations

WKdm compression has been used by Apple's OS X since version 10.9 Mavericks in 2013 and is present in the shared source of the open-source Darwin/XNU kernel. WKdm compression has also been implemented in the OLPC Linux kernel. A compressed-cache implementation for the Linux virtual-memory manager, called "CCache", has also been demonstrated working with the WKdm algorithm and its variants on a Nokia N800 Internet Tablet with a TI OMAP processor.

References

  1. Wilson, Paul R.; Kaplan, Scott F.; Smaragdakis, Yannis (1999-06-06). The Case for Compressed Caching in Virtual Memory Systems (PDF). USENIX Annual Technical Conference. Monterey, California, USA. pp. 101–116.
  2. Simpson, Matthew; Barua, Rajeev; Biswas, Surupa (12 Aug 2023). "Analysis of Compression Algorithms for Program Data" (PDF). University of Maryland: 4–14. Retrieved 6 Jan 2024.
  3. Farias Briglia, Anderson; Bezerra, Allan; Moiseichuk, Leonid; Gupta, Nitin (2007). "Evaluating effects of cache memory compression on embedded systems" (PDF). 2007 Linux Symposium: 56–57. Retrieved 11 Jan 2024.
  4. The explanation of the ARM64 implementation and of the general WKdm compression algorithm is contained in the second block comment. Similar explanations can be found in the OSFMK ARM32, x64 and x86 code. "WKdmCompress_4k.s". github.com. 28 Sep 2017. Retrieved 15 Jan 2024.
  5. The explanation of the ARM64 implementation and of the general WKdm decompression algorithm is contained in the second block comment. Similar explanations can be found in the OSFMK ARM32, x64 and x86 code. "WKdmDecompress_4k.s". github.com. 28 Sep 2017. Retrieved 15 Jan 2024.
  6. Isaac, Sean; Anderson, Geronimo; Nordby, Jon (19 July 2019). "Cache and Memory Compression Techniques". github.com. Retrieved 3 Mar 2024.
  7. Larabel, Michael (2 Jun 2017). "LZ4m: Taking LZ4 Compression To The Next Level". phoronix.com. Retrieved 3 Mar 2024. But LZ4m does come up short of the WKdm page compressor's compression ratio and compression speed.
  8. Chihaia Tuduce, Irina; Gross, Thomas (2005). "Adaptive Main Memory Compression" (PDF). USENIX Annual Technical Conference: 237–250. Retrieved 5 Mar 2024. For all experiments, we use the WKdm compression algorithm as it shows superior performance over other algorithms
  9. "OS X 10.9 Mavericks: The Ars Technica Review". 22 October 2013.
  10. "OS X Mavericks Core Technologies Overview October 2013" (PDF). apple.com. Oct 2013. p. 7. Retrieved 9 Jan 2024.
  11. "OS X El Capitan Core Technologies Overview September 2015" (PDF). apple.com. Sep 2015. p. 7. Retrieved 9 Jan 2024.
  12. Kessler, Topher (10 Jun 2013). "Memory compression brings RAM Doubler to OS X Mavericks". cnet.com. Retrieved 3 March 2024.
  13. G. Richard III, Golden; Case, Andrew (17 Jul 2014). "In lieu of swap: Analyzing compressed RAM in Mac OS X and Linux". Digital Investigation. 11: S7–S8. doi:10.1016/j.diin.2014.05.011. Retrieved 11 Jan 2024.
  14. Beltran, Vicenç; Martorell, Xavier; Torres, Jordi; Ayguadé, Eduard (2008). "Accelerating Software Memory Compression on the Cell/B.E." (PDF). First Workshop on Hardware Accelerators for High-Performance Computing: 2. Retrieved 11 Jan 2024.
  15. Gupta, Nitin (7 Jul 2006). "[linux-mm-cc] Compression structure implementation". marc.info. Retrieved 9 Jan 2024.