My design will at least consist of the following:
My intention is to execute one instruction per cycle, unless I find out later that it is not technical infeasible to do so.
At the very least, there must be addition and subtraction instructions. Direct jump and conditional jump instructions would be very nice for any useful program flow control. Loading and storing values from and to the memory would also be essential.
Having a stack to hold return addresses would really be nice to simplify jumps. Using a return instruction to go back to the return address would be easier than keeping track of the return address when writing the program. However, this would add complexity to system, so it probably won't be implemented. Maybe a return address register?
A magnitude comparator can be used to compare values to find out which one is larger or if the values are equal. This can be used for another jump mode. Perhaps the complexity will increase by addition of a comparison result register, be it a 1-bit (for either "less than" or "greater than or equal" states) or 2-bit (for all three possible states) register.
Having a total of 8 bits available for opcode and operand brings some fairly tough constraints. Giving 3 bits for the opcode will allow me to have 8 (2^3) instructions, however the memory addressing will be limited to 32 locations.
Another option is to have each instruction be 16-hits long, with the first ("high" order) be the instruction byte and the second ("low" order) be the operand. This would allow a total of 256 instructions and the operand to hold 256 distinct values, allowing memory addressing of 256 locations. However, this will require 2 cycles to fetch each instruction with a 8-bit data bus and will increase complexity when holding the instruction in a 16-bit instruction register.