Search found 178 matches

by agner
2021-06-28, 11:34:58
Forum: forwardcom forum
Topic: Universal boolean instruction
Replies: 5
Views: 16766

Re: Universal boolean instruction

Thank you for your comment. I was not aware of MRISC32. It looks like we have got some of the same ideas. ForwardCom allows instructions to be up to three 32-bit words. This makes it possible to overcome the problem of cramming a lot of information into a single code word, that most RISC designs suf...
by agner
2021-06-28, 6:33:47
Forum: forwardcom forum
Topic: Universal boolean instruction
Replies: 5
Views: 16766

Re: Universal boolean instruction

Now I have tried to synthesize it in an FPGA. The solution with a complete truth table uses 18% more slices and 40% more LUTs than the bitwise_logic circuit I described above. The truth table implementation uses fewer resources than I expected because the FPGA has efficient ways of implementi...
by agner
2021-06-28, 6:16:26
Forum: forwardcom forum
Topic: input/output instructions
Replies: 8
Views: 23015

Re: input/output instructions

Not 100%. Hubert has fertilized it with valuable insight in hardware design :-)
by agner
2021-06-27, 7:52:00
Forum: forwardcom forum
Topic: Universal boolean instruction
Replies: 5
Views: 16766

Universal boolean instruction

I have an idea for a multi-purpose 3-input bitwise boolean instruction. This instruction can implement all functions of the type RESULT = (A AND/OR B) AND/OR/XOR C with optional inversion on all inputs and outputs. It can also do A XOR B XOR C and bit selection: (A AND B) OR (NOT A AND C). The latter c...
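To make the idea concrete, here is a minimal Python sketch of the "complete truth table" variant mentioned in the follow-up post, not the bitwise_logic circuit itself. The function name, the 8-bit table encoding, and the bit ordering (A is the most significant index bit) are assumptions made for illustration; they are not taken from the ForwardCom specification.

```python
def ternary_logic(a, b, c, truth_table, width=64):
    """Apply an arbitrary 3-input boolean function to every bit position.

    truth_table is an 8-bit value. Bit number (a_bit*4 + b_bit*2 + c_bit)
    of the table gives the result for that input combination.
    This encoding is an assumption for illustration only.
    """
    mask = (1 << width) - 1
    result = 0
    for index in range(8):
        if (truth_table >> index) & 1:
            # Select each input or its complement according to the index bits
            term_a = a if index & 4 else ~a
            term_b = b if index & 2 else ~b
            term_c = c if index & 1 else ~c
            result |= term_a & term_b & term_c
    return result & mask


# With this encoding the inputs themselves have the tables A=0xF0, B=0xCC, C=0xAA
XOR3 = 0xF0 ^ 0xCC ^ 0xAA                       # A XOR B XOR C
SELECT = (0xF0 & 0xCC) | (~0xF0 & 0xAA & 0xFF)  # (A AND B) OR (NOT A AND C)
```

Under this encoding, any of the 256 possible 3-input functions is obtained by applying the same formula to the constants 0xF0, 0xCC and 0xAA.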
by agner
2021-06-25, 5:11:33
Forum: forwardcom forum
Topic: Macro-op fusion as an intentional instruction set design choice
Replies: 4
Views: 17484

Re: Macro-op fusion as an intentional instruction set design choice

Hubert, my main motivation for making load+alu instructions is to do more work per instruction = higher throughput. The x86 instruction set is quite efficient despite a terribly complicated decode process, exactly because it does more work per instruction. This is also reducing the register load. An...
by agner
2021-06-19, 5:32:29
Forum: forwardcom forum
Topic: Default integer size 32 or 64 bits?
Replies: 7
Views: 20366

Re: Default integer size 32 or 64 bits?

Power consumption is also an issue. 32 bit integers use less power. My soft core can run faster when 64-bit integers are not implemented. "While writing C++, I eventually switched to size_t": size_t is unsigned, so it would not fit the short version loop instructions. The corresponding signed type in C...
by agner
2021-06-18, 17:09:31
Forum: forwardcom forum
Topic: Default integer size 32 or 64 bits?
Replies: 7
Views: 20366

Re: Default integer size 32 or 64 bits?

Most integer instructions are available in 8, 16, 32, and 64 bit versions. The question is only which one to prioritize for short instructions (single code word). The forthcoming version (1.11) will have both 32 and 64 bit short versions of some instructions. Most branch and loop instructions ...
by agner
2021-06-17, 18:44:53
Forum: forwardcom forum
Topic: Load From Const Array Instruction
Replies: 3
Views: 7283

Re: Load From Const Array Instruction

The solution that Hubert proposes is already possible with the present design. ForwardCom supports a separate read-only data section addressed relative to IP. An addressing mode with [IP + offset + scaled index] is also supported. I don't remember if we have discussed this before, but it is certainl...
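As a rough illustration of the [IP + offset + scaled index] addressing mode mentioned above, the following Python sketch computes the effective address and reads one element from a simulated read-only table. The helper names, the little-endian byte order, the 4-byte element size, and the exact meaning of the IP value are assumptions for this example only.

```python
def effective_address(ip, offset, index, scale):
    """Address of an [IP + offset + scaled index] operand.

    scale is assumed here to equal the element size in bytes.
    """
    return ip + offset + index * scale


def load_const(memory, ip, offset, index, element_size=4):
    """Load one table element from simulated memory (little-endian assumed)."""
    addr = effective_address(ip, offset, index, element_size)
    return int.from_bytes(memory[addr:addr + element_size], "little")


# Hypothetical layout: code at 0x40, constant table of four ints at 0x100
memory = bytearray(0x110)
for i, value in enumerate([10, 20, 30, 40]):
    memory[0x100 + 4 * i:0x104 + 4 * i] = value.to_bytes(4, "little")

ip = 0x40
offset = 0x100 - ip   # the table is addressed relative to IP
third = load_const(memory, ip, offset, 2)
```

Because the table is reached through an IP-relative offset, a single copy can be read from any point in the code without a jump.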
by agner
2021-06-10, 18:32:36
Forum: forwardcom forum
Topic: Load From Const Array Instruction
Replies: 3
Views: 7283

Re: Load From Const Array Instruction

Thank you for your suggestion. A problem with your proposal is that jumps are costly, especially if the pipeline is long, because they interrupt the prefetching and decoding of instructions. Another problem is that the table needs multiple copies if it is accessed from multiple points in the code. I...
by agner
2021-04-13, 5:28:33
Forum: forwardcom forum
Topic: Macro-op fusion as an intentional instruction set design choice
Replies: 4
Views: 17484

Re: Macro-op fusion as an intentional instruction set design choice

The x86 instruction set introduced prefixes long ago. Today, there are many different prefixes that are 1, 2, 3, and 4 bytes long. There is no limit to how many prefixes an x86 instruction can have as long as the complete instruction is no more than 15 bytes long. This is a nightmare to decode. T...
by agner
2021-03-22, 9:58:14
Forum: forwardcom forum
Topic: Default integer size 32 or 64 bits?
Replies: 7
Views: 20366

Default integer size 32 or 64 bits?

Some ForwardCom instructions are available in a short form using format template C. Template C has one register field, 16 bits of immediate data, and no operand size field. This will fit an instruction like, for example: int r1 += 1000. I am in doubt whether the integer size should be 32 bits or 64 bit...
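The practical difference between the two choices can be sketched in Python. This is only a model: the function name is hypothetical, and sign extension of the 16-bit immediate is an assumption, not something stated in the post.

```python
def add_imm16(reg_value, imm16, operand_bits):
    """Model of 'r1 += immediate' with a 16-bit immediate field.

    The immediate is assumed to be sign-extended, and the result is
    truncated to the operand size (32 or 64 bits).
    """
    imm = imm16 & 0xFFFF
    if imm & 0x8000:          # sign-extend the 16-bit field
        imm -= 0x10000
    return (reg_value + imm) & ((1 << operand_bits) - 1)


r1 = 0xFFFFFFFF
as_32 = add_imm16(r1, 1000, 32)   # wraps around at 2**32
as_64 = add_imm16(r1, 1000, 64)   # carries into the upper half
```

For values that stay below 2**31 the two versions give the same result, so the choice only matters near the 32-bit boundary and for the upper half of the register.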
by agner
2021-03-19, 16:44:14
Forum: forwardcom forum
Topic: Rollbackable L1 Data Cache Design?
Replies: 7
Views: 12742

Re: Rollbackable L1 Data Cache Design?

Functions that receive a pointer can check if it is aligned. Functions like memcpy do that. But it is unrealistic to require that all functions have multiple paths for aligned and unaligned pointers. In most situations you can require that pointers be aligned according to the data size. Alignment of...
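The kind of check described here can be sketched as follows. This is a Python model that treats pointers as plain integers; the function names and the "word"/"byte" labels are made up for illustration and do not describe any actual memcpy implementation.

```python
def is_aligned(address, size):
    """True if address is a multiple of size. size must be a power of two."""
    return (address & (size - 1)) == 0


def choose_copy_method(dest, src, word_size=8):
    """Pick a copy strategy the way a memcpy-like function might.

    Returns "word" when both pointers are naturally aligned,
    otherwise "byte" as a slower fallback (labels are hypothetical).
    """
    if is_aligned(dest, word_size) and is_aligned(src, word_size):
        return "word"
    return "byte"
```

A function that requires aligned pointers would only need the first check, for example as an assertion on entry.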
by agner
2021-03-15, 7:15:08
Forum: forwardcom forum
Topic: Rollbackable L1 Data Cache Design?
Replies: 7
Views: 12742

Re: Rollbackable L1 Data Cache Design?

In most cases, the compiler will know whether memory is aligned or not. Standard functions like memcpy check whether the pointers are aligned before deciding which method is optimal. Shifting data to make it aligned can be done in software. This may be inconvenient for the programmer, but i...
by agner
2021-03-13, 7:17:19
Forum: forwardcom forum
Topic: Rollbackable L1 Data Cache Design?
Replies: 7
Views: 12742

Re: Rollbackable L1 Data Cache Design?

Thank you Hubert for the explanation. It is a relevant discussion how caching can be made simpler. The ForwardCom design may have restrictions on alignment. Unaligned memory accesses could be split into two, or simply not allowed. The memcpy library function would need to shift data if source and de...
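Splitting an unaligned access into two can be modeled like this. It is a Python sketch under stated assumptions: the 8-byte block size, the function name, and the restriction that an access is no larger than one block are all chosen for illustration.

```python
def split_access(address, size, block_size=8):
    """Split a memory access so that no piece crosses a block boundary.

    Returns a list of (address, size) pieces. Assumes size <= block_size,
    so at most two pieces are needed. block_size is a hypothetical value.
    """
    next_boundary = (address // block_size + 1) * block_size
    if address + size <= next_boundary:
        return [(address, size)]            # fits inside one block
    first_part = next_boundary - address
    return [(address, first_part), (next_boundary, size - first_part)]
```

For example, a 4-byte access at address 6 becomes two 2-byte accesses at addresses 6 and 8, while the same access at address 8 is left as one piece.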
by agner
2021-03-02, 8:01:23
Forum: forwardcom forum
Topic: Implications of ForwardCom memory management approach
Replies: 15
Views: 26832

Re: Implications of ForwardCom memory management approach

When I google for "shared virtual address model" I get something with CPU and GPU sharing the same virtual addresses, but still with fixed-size pages. I think there is little need for a GPU when the CPU has long vectors. But address translation after the cache may be a very good idea. I wo...
by agner
2021-01-29, 6:51:07
Forum: forwardcom forum
Topic: Using CPU cores as GPU
Replies: 3
Views: 6308

Re: Using CPU cores as GPU

Thank you for the reference. I am not an expert in graphics processing, but I think that ForwardCom is well suited for adding graphics processing instructions. The fundamental data types and elementary instructions proposed by Pixlica are available in ForwardCom. ForwardCom has the further advantage...
by agner
2020-11-21, 6:59:12
Forum: forwardcom forum
Topic: input/output instructions
Replies: 8
Views: 23015

Re: input/output instructions

Hubert Lamontagne, I am not sure you would want to have a large graphics coprocessor. I would prefer to have a multicore ForwardCom processor with large vectors rather than a smaller ForwardCom processor connected to a graphics coprocessor using a different instruction set.
by agner
2020-11-03, 17:09:55
Forum: forwardcom forum
Topic: input/output instructions
Replies: 8
Views: 23015

input/output instructions

I prefer to have separate instructions for input and output, and a separate address space for in/out instead of memory mapped in/out. The reasons are: All memory addresses are relative and position-independent, while input/output addresses are usually absolute addresses. Code and data memory has a 6...
by agner
2020-09-22, 4:29:15
Forum: forwardcom forum
Topic: Store pair instruction
Replies: 2
Views: 6316

Re: Store pair instruction

The push and pop instructions can write or read a sequence of registers with increment or decrement of the stack pointer or an arbitrary pointer register. It will typically take one clock cycle for each register plus an extra clock cycle in the end for updating the pointer. Support for these instruc...
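The timing described above (one clock cycle per register plus one for the pointer update) can be captured in a small Python model. The function name, the 8-byte word size, the decrement-before-store order, and the little-endian layout are assumptions for illustration and are not taken from the ForwardCom manual.

```python
def push_registers(memory, pointer, values, word_size=8):
    """Model a multi-register push through an arbitrary pointer register.

    Each register value is stored below the previous one. The cycle count
    follows the estimate in the post: one cycle per register plus one
    extra cycle at the end for updating the pointer.
    """
    cycles = 0
    for value in values:
        pointer -= word_size                      # assumed pre-decrement
        memory[pointer:pointer + word_size] = value.to_bytes(word_size, "little")
        cycles += 1                               # one cycle per register
    cycles += 1                                   # final pointer update
    return pointer, cycles


stack = bytearray(64)
new_pointer, cycles = push_registers(stack, 64, [1, 2, 3])
```

Pushing three registers in this model takes four cycles and moves the pointer down by 24 bytes.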
by agner
2020-08-09, 4:54:59
Forum: forwardcom forum
Topic: Implications of ForwardCom memory management approach
Replies: 15
Views: 26832

Re: Implications of ForwardCom memory management approach

Sebastian, I am not sure I understand your idea. Why do you think memory protection is critical for the speed, but address translation is not?
by agner
2020-06-25, 6:13:38
Forum: forwardcom forum
Topic: Implications of ForwardCom memory management approach
Replies: 15
Views: 26832

Re: Implications of ForwardCom memory management approach

You may have to emulate fixed size page tables to run Linux etc. Or you may want to invent something new. ForwardCom can make memory blocks that are executable but not readable to make a true Harvard architecture, and memory blocks that are readable but not executable for security reasons. It can al...
by agner
2020-06-15, 13:09:58
Forum: forwardcom forum
Topic: Implications of ForwardCom memory management approach
Replies: 15
Views: 26832

Re: Implications of ForwardCom memory management approach

A TLB lookup may take 1-2 clock cycles. A TLB miss costs many clock cycles. The chip area used for the TLB could be used for making a larger data cache or code cache instead. You can't have unlimited chip area without slowing down everything.
by agner
2020-06-10, 5:00:40
Forum: forwardcom forum
Topic: Proposal to drop tiny instructions
Replies: 11
Views: 29639

Re: Proposal to drop tiny instructions

The x86 instruction set has many complex instructions with multiple output registers and multiple µops. Such complex instructions are implemented as microcode in both Intel and AMD processors. They are quite slow because the normal pipeline flow is interrupted while code is being fetched from a micr...
by agner
2020-06-05, 5:11:35
Forum: forwardcom forum
Topic: Proposal to drop tiny instructions
Replies: 11
Views: 29639

Re: Proposal to drop tiny instructions

Regarding multi-register instructions: We should not have too many such instructions, because they are complicated to implement in the decoder. I have thought about instructions for zeroing many registers or clearing many vector registers. If you need a multi-register move, you could probably use a ...
by agner
2020-05-21, 18:16:44
Forum: forwardcom forum
Topic: Proposal to drop tiny instructions
Replies: 11
Views: 29639

Re: Proposal to drop tiny instructions

The decision has been made now. I have removed the tiny instructions and updated all documentation and tools according to this change. New push and pop instructions are added. This change allowed me to get rid of a lot of complexity.

This is version 1.10.