Machine Code Instructions

27    10 Aug 2015 02:54 by u/roznak

8 comments

3

so sloowwwwwww

2

I hope this makes people curious about assembly language rather than scared of it. It is not that complicated or scary at all. It is C that makes it appear more complicated and scary.

Understand the assembly that gets generated from your C function, or even your C# function, and you will write faster, better-optimized code.

1

Well that and to actually use assembly you kinda have to know a bunch of things that C automagically does right. In particular (on unixy systems) you have to understand some stuff about how ELF works, calling conventions, system call conventions, etc.

I'd warn against thinking you can create more efficient assembly than a compiler though. Compilers at -O3 tend to produce very very efficient code as they're not bound by the limits of human sanity.

Also, for the people reading this who don't have as much experience, note that C# compiles to a bytecode (Common Intermediate Language), not to machine code. CIL, while looking assemblyish is actually quite a bit higher level since it has for example a built-in notion of objects. On the plus side CIL does have plenty of good documentation and is relatively readable so it's definitely a good place to start. For a (somewhat) similar assembly-like language closer "to the metal" I'd recommend LLVM IR.

2

Compilers at -O3 tend to produce very very efficient code as they're not bound by the limits of human sanity.

But only when it can. Do not blindly trust the -O3 generated assembler, you really must test it.

Sometimes reordering your function and variables can speed things up. E.g. something I have learned is that your function must be written in such a way that the code that is used most of the time is in front of the code that will rarely be called. You help the C compiler avoid a jump in the code that should run most of the time. A jump in the assembler code can stall execution because the instructions at the jump target may first have to be loaded into the cache. The optimizer can optimize the code as it was written, but it cannot optimize for what the code is intended to do.

if x then A() else B() could be slower than if (not x) then B() else A();

Sometimes to gain speed you could have the 2 versions, one optimized for A() and one optimized for B().
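To make the "common case first" idea concrete, here is a minimal sketch in C. It assumes GCC or Clang, which provide `__builtin_expect`; the `LIKELY` macro and the `sum_nonnegative` function are made-up names for illustration, and whether the hint changes the generated code at all depends on the compiler and flags, so measure before relying on it.

```c
#include <stddef.h>

/* GCC and Clang provide __builtin_expect; elsewhere the hint is dropped. */
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x) __builtin_expect(!!(x), 1)
#else
#define LIKELY(x) (x)
#endif

/* Hypothetical workload: sum the values, assuming negative entries are rare.
   The common case is written first and marked as the expected branch. */
long sum_nonnegative(const int *values, size_t count, size_t *error_count)
{
    long total = 0;
    *error_count = 0;
    for (size_t i = 0; i < count; i++) {
        if (LIKELY(values[i] >= 0)) {
            total += values[i];     /* common path */
        } else {
            (*error_count)++;       /* rare path */
        }
    }
    return total;
}
```

Compiling this with and without the hint and comparing the assembly (for example with `-S`) is one way to check whether the compiler actually moved the rare path out of the way.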

note that C# compiles to a bytecode (Common Intermediate Language), not to machine code. CIL,

Don't let the IL code fool you. When you look at how the IL code gets translated into real machine code (assembler), the two are very close.

Voat is not very friendly for writing code examples, and I do not have much time to write code samples. I just hope this rekindles the love for assembly and that people dare to look under the hood at what the C compiler creates.

0

But only when it can. Do not blindly trust the -O3 generated assembler, you really must test it.

I agree, optimization can make code less efficient in odd cases. When improving performance one should always measure, measure and measure.

E.g. something I have learned is that your function must be written in such a way that the code that is used most of the time is in front of the code that will rarely be called.

What you describe sounds like the effects of branch misprediction and can be mitigated both by the compiler and the processor. I know that LLVM could move always executed code to the front of the function, or at least it has all the information needed to know if it's safe to do so. Modern processors try to do branch prediction where they optimistically work on instructions of the branch they think code will take. If somehow that process gets messed up you indeed lose performance.

Don't let the IL code fool you. When you look at how the IL code gets translated into real machine code (assembler), the two are very close.

Yeah, I would expect that. The main difference (aside from the whole OO support) seems to be that processors have a few registers which are super fast to access for intermediate results while CIL uses a stack for that.
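A rough sketch of that stack-versus-registers difference, in C. The opcodes below are hypothetical and only loosely modeled on the push/add/mul style of a stack bytecode; this is not the real CIL encoding. `run_stack` evaluates `(a + b) * c` through an explicit operand stack, while `run_registers` is the same computation as plain C, where a compiler for a register machine would typically keep the intermediate sum in a register.

```c
#include <stddef.h>

/* Hypothetical opcodes for illustration only; not real CIL. */
enum op { OP_PUSH, OP_ADD, OP_MUL };

struct insn { enum op op; int value; };

/* Evaluate a program on an explicit operand stack and return the value
   left on top. Assumes a well-formed program needing at most 16 slots. */
int run_stack(const struct insn *prog, size_t n)
{
    int stack[16];
    size_t sp = 0;
    for (size_t i = 0; i < n; i++) {
        switch (prog[i].op) {
        case OP_PUSH: stack[sp++] = prog[i].value; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        }
    }
    return stack[sp - 1];
}

/* Same computation without the stack: the intermediate result t is a
   natural candidate for a register. */
int run_registers(int a, int b, int c)
{
    int t = a + b;
    return t * c;
}
```

For `(2 + 3) * 4` the stack program would be push 2, push 3, add, push 4, mul.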

Come to think of it, there's another reason why writing assembly is sometimes useful. New processors sometimes have special instructions for very specific tasks and to use those it can be unavoidable to write assembly yourself (or use the assembly someone else has written). For example code that does a lot of matrix calculations can benefit from the use of SSE but I don't believe compilers often take advantage of that.
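As a hedged illustration of using SSE without writing the assembly by hand: the sketch below uses compiler intrinsics from `<xmmintrin.h>`, which is a different technique from hand-written assembly but maps closely onto the same instructions. The function name `add_floats` is made up, and the code falls back to a plain loop when SSE is not available at compile time.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* out[i] = a[i] + b[i] for i in [0, n). */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if HAVE_SSE
    /* Process four floats per iteration using unaligned loads and stores. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar loop for the leftover elements, or for everything without SSE. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Whether this beats what the compiler generates on its own for the plain loop is something to measure rather than assume.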

0

MMX, SSE and SSE2 were never used when I was developing in C++ back then. Pre-VS2005!!!

When I started with C# I barely went back to C++.

1

I'd like to add that the bits in the opcode can often be treated as bitfields that select which broad class of instruction is meant.
For instance, it could easily be that all the "load accumulator" instructions must have the 6th bit set and the 7th bit clear, with the remainder of the bits specifying whether the operand is a memory location, another register or an immediate.
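A small sketch of decoding such a layout in C. The encoding is entirely hypothetical and not taken from any real CPU: it assumes an 8-bit opcode with bits numbered from 0, where bit 6 set and bit 7 clear mark a "load accumulator" instruction and the low two bits select the operand kind. The names `is_load_accumulator` and `operand_kind_of` are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical operand kinds stored in the low two bits of the opcode. */
enum operand_kind {
    OPERAND_IMMEDIATE = 0,
    OPERAND_REGISTER  = 1,
    OPERAND_MEMORY    = 2
};

/* True when bit 7 is clear and bit 6 is set (mask 0xC0, expected 0x40). */
bool is_load_accumulator(uint8_t opcode)
{
    return (opcode & 0xC0u) == 0x40u;
}

/* Extract the operand-kind field from the low two bits. */
int operand_kind_of(uint8_t opcode)
{
    return opcode & 0x03u;
}
```

With this made-up layout, `0x42` would be a load-accumulator instruction with a memory operand, while `0xC1` would belong to some other instruction class.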

0

I see an expert here. :-)