Christian Lawson-Perfect @christianp

TIL that modern CPUs have an $F_2$-polynomial multiplication intrinsic operation: https://en.wikipedia.org/wiki/CLMUL_instruction_set

en.wikipedia.org: CLMUL instruction set - Wikipedia
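To make "$F_2$-polynomial multiplication" concrete, here is a minimal software sketch of what a carry-less multiply computes. The function name `clmul` is made up for illustration and is not the hardware intrinsic; it treats bit i of each integer as the coefficient of $x^i$ and adds partial products with XOR instead of carrying addition.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two non-negative integers.

    Bit i of each argument is read as the coefficient of x**i
    of a polynomial over F_2.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in F_2 is XOR, so nothing carries
        a <<= 1          # multiply a by x
        b >>= 1          # move on to the next coefficient of b
    return result


# (x^2 + x + 1) * (x + 1) = x^3 + 1 over F_2
print(bin(clmul(0b111, 0b11)))  # 0b1001
```

For comparison, the ordinary integer product of the same inputs is 7 * 3 = 21 = 0b10101; the two results differ exactly because the carries are dropped.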

Dec 01, 2024, 05:32 AM · Web

6 boosts · 14 favorites

**Per Vognsen** @pervognsen@mastodon.social · Dec 1, 2024 *

@j2kun Yeah, it's surprisingly useful. Aside from the classic "algebraic" use cases, there are some bit tricks that come up often, like computing the running bit parity by carryless multiplying by all ones (i.e. -1).
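A rough sketch of the running-parity trick, using a software stand-in for the instruction. The names `clmul` and `running_parity` and the 64-bit width are assumptions for illustration. Multiplying by all ones means bit k of the product is the XOR of input bits 0 through k, which is the running parity.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less (XOR-accumulating) product of two non-negative integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


WIDTH = 64                   # assumed word size
ALL_ONES = (1 << WIDTH) - 1  # bit pattern of -1 in a 64-bit register


def running_parity(x: int) -> int:
    """Bit k of the result is the XOR of bits 0..k of x (low WIDTH bits kept)."""
    return clmul(x, ALL_ONES) & ALL_ONES


# Bits 2 and 5 are set, so the parity is 1 from bit 2 up to bit 4.
print(bin(running_parity(0b100100)))  # 0b11100
```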

**Per Vognsen** @pervognsen@mastodon.social · Dec 1, 2024 *

@j2kun For example, if you mark the start and end of a range with a 1 bit then the running parity is a mask vector to select the bits in those ranges. You can even use this for computing rasterization coverage masks for potentially overlapping polygons where overlaps are resolved with the "mod 2" rule.
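The range-mask idea can be sketched as follows, again with a software carry-less multiply. The helper names and the convention that the end marker sits one past the last selected bit are assumptions for illustration. Markers are toggled with XOR, so overlapping ranges cancel out, which gives the "mod 2" behaviour described above.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less (XOR-accumulating) product of two non-negative integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def running_parity(x: int, width: int = 64) -> int:
    """Bit k of the result is the XOR of bits 0..k of x."""
    all_ones = (1 << width) - 1
    return clmul(x, all_ones) & all_ones


def range_markers(ranges) -> int:
    """Toggle one bit at each start and at each (exclusive) end position."""
    markers = 0
    for start, end in ranges:
        markers ^= 1 << start
        markers ^= 1 << end
    return markers


# Two disjoint ranges: bits 1..3 and bits 6..8 are selected.
disjoint = running_parity(range_markers([(1, 4), (6, 9)]))
print(f"{disjoint:012b}")  # 000111001110

# Two overlapping ranges: the shared part (bits 5..7) is covered twice,
# so it drops out of the mask.
overlap = running_parity(range_markers([(2, 8), (5, 11)]))
print(f"{overlap:012b}")  # 011100011100
```

Selecting the covered bits of some data word is then a single AND with the resulting mask.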

**Per Vognsen** @pervognsen@mastodon.social · Dec 1, 2024

@j2kun And here's a fun application to parsing quoted strings: https://github.com/simdjson/simdjson/blob/cab383e1de7385c6460b66e5fad25a116d750402/src/generic/stage1/json_string_scanner.h#L67

Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks - simdjson/simdjson

GitHub: simdjson/src/generic/stage1/json_string_scanner.h at cab383e1de7385c6460b66e5fad25a116d750402 · simdjson/simdjson
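A simplified sketch of how the same running-parity trick can find the inside of quoted strings. This is not the simdjson code: the function names are invented here, and the sketch ignores backslash-escaped quotes and state carried between blocks, both of which a real parser has to handle.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less (XOR-accumulating) product of two non-negative integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def running_parity(x: int, width: int) -> int:
    """Bit k of the result is the XOR of bits 0..k of x."""
    all_ones = (1 << width) - 1
    return clmul(x, all_ones) & all_ones


def in_string_mask(text: str) -> int:
    """Bit i is set when character i is an opening quote or inside a string.

    Assumption: every '"' is a real string delimiter (no escapes).
    """
    quotes = sum(1 << i for i, ch in enumerate(text) if ch == '"')
    return running_parity(quotes, width=len(text))


def masked_chars(text: str) -> str:
    """Characters selected by the mask, for inspection."""
    mask = in_string_mask(text)
    return "".join(ch for i, ch in enumerate(text) if (mask >> i) & 1)


print(repr(masked_chars('x "ab" y')))  # '"ab'
```

The mask includes each opening quote and excludes each closing quote, because the parity flips at the quote character itself.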

**Marc B. Reynolds** @mbr@mastodon.gamedev.place · Dec 3, 2024

@pervognsen @j2kun speeding up multi-block CRC32C is another example:

https://www.corsix.org/content/fast-crc32c-4k

www.corsix.org: Faster CRC32-C on x86

**Janne Moren** @jannem@fosstodon.org · Dec 1, 2024

@j2kun
A bit of an aside, but when you have instructions named PCLMULQDQ, PCLMULLQLQDQ, PCLMULHQLQDQ, PCLMULLQHQDQ and PCLMULHQHQDQ, I'm starting to question the use of the term "mnemonics" for assembler instruction names.