From Msim

Revision as of 18:44, 1 February 2010 by Jloew (Talk | contribs)

This page refers to the changes made in the 3.0.14 release of M-sim.

Release Date: 02/01/2010


Spec Mode Problem

Spec mode problem fixed (01/06/2010 updated version is online). A context that was not in spec mode could have a NULL/invalid ROB pointer and cause the simulator to transition into spec mode.

In sim-outorder.c, register_rename(...), the condition following the comment
     //if this is a branching instruction update BTB, i.e., only non-speculative state is committed into the BTB
should be:
 if(!contexts[disp_context_id].spec_mode && rs)
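As a rough illustration of why both halves of the corrected condition matter, the sketch below uses stand-in types. The names context_sketch, rob_entry_sketch and may_update_btb are hypothetical and are not taken from M-sim; only the shape of the check comes from the fix above.

```cpp
#include <cstddef>

// Hypothetical stand-ins; M-sim's real context and ROB entry types are richer.
struct rob_entry_sketch { bool is_branch; };
struct context_sketch   { bool spec_mode; };

// Mirrors the corrected check: the context must be non-speculative AND the
// ROB pointer must be valid before the entry is used for a BTB update.
bool may_update_btb(const context_sketch& ctx, const rob_entry_sketch* rs) {
    return !ctx.spec_mode && rs != nullptr;
}
```

With only the spec_mode test, a NULL rs would still pass the check, which is the case the fix guards against.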

Global Changes

HOST_HAS_QWORD checks have been largely removed. QWORD support is essentially mandatory for various functionality to work.
setjmp and longjmp support has been removed. It is no longer needed. Checks are made in sim-outorder to determine if there are no threads left. Execution should abort normally through sim_main(...).
Various objects no longer use the built-in statistics package (some issues with addresses of variables changing led to this) and now output their own statistics. Cleanup of the output will be done in a later release.

Branch Predictor

All predictors have their own files. They now inherit from the parent predictor type.
bpreds.h controls the inclusion of the branch predictors. Removing a predictor from this file excludes it from compilation (be aware that make clean may be needed first; the makefile is in dire need of fixing).

Branch Target Buffer (BTB)

Size and associativity are no longer required to be non-zero (if one is zero, the other must be too). This change was needed for the combining predictor.
md_opcode op is no longer used (it didn't appear to be used anywhere anyway).
A code rewrite is embedded in btb.c and btb.h but is not enabled by default. Testing of the rewrite has not finished at this point.
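The size/associativity rule in the first item of this section can be stated as a one-line check. This is a minimal sketch; btb_config_ok is a hypothetical helper name, not a function in btb.c.

```cpp
// Hypothetical helper: accepts a BTB configuration only when size and
// associativity are both zero (no BTB) or both non-zero.
bool btb_config_ok(unsigned int size, unsigned int assoc) {
    return (size == 0) == (assoc == 0);
}
```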

Return Address Stack (retstack, RAS)

Added a method, clear(), to empty the retstack. This was added for execve support.
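For readers unfamiliar with the structure, here is a minimal sketch of a return address stack with a clear() method. The class name and layout are assumptions made for illustration and do not reproduce M-sim's retstack implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical circular return address stack; not M-sim's retstack class.
class retstack_sketch {
public:
    explicit retstack_sketch(std::size_t size) : stack_(size, 0), tos_(0), count_(0) {}

    void push(unsigned long long addr) {
        if (stack_.empty()) return;                 // a zero-entry stack stores nothing
        tos_ = (tos_ + 1) % stack_.size();
        stack_[tos_] = addr;
        if (count_ < stack_.size()) ++count_;       // when full, the oldest entry is overwritten
    }

    // Returns the most recent return address, or 0 when the stack is empty.
    unsigned long long pop() {
        if (count_ == 0) return 0;
        unsigned long long addr = stack_[tos_];
        tos_ = (tos_ + stack_.size() - 1) % stack_.size();
        --count_;
        return addr;
    }

    // Forgets every stored prediction, e.g. when execve replaces the program image.
    void clear() { tos_ = 0; count_ = 0; }

    std::size_t size() const { return count_; }

private:
    std::vector<unsigned long long> stack_;
    std::size_t tos_;    // index of the top-of-stack entry
    std::size_t count_;  // number of valid entries
};
```

After execve the old program's return addresses are meaningless, so dropping them avoids predicting returns into code that no longer exists.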

Cache

The use of the hash table is disabled by default. Define USE_HASH in cache.h to enable this.
The use of the statistics handler is no longer implemented. Instead, the cache can dump its stats directly.
A model for bus contention times has been added. It is disabled by default (define BUS_CONTENTION in cache.h to enable this). This model was disabled since it slows down fast forwarding dramatically.
last_tagset and last_blk (for fast handling of repeated cache hits) have been removed. Unless there were some compiler optimizations going on under the hood, this was not providing an advantage.
context_id for a block now defaults to -1; otherwise, a hit would be falsely detected if the address generated the default tag for a cache block.
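The false-hit case described in the last item can be sketched as follows. The block layout, the default tag value of 0, and the names blk_sketch and is_hit are assumptions for illustration; the real cache block carries more state.

```cpp
#include <vector>

// Hypothetical, simplified cache block.
struct blk_sketch {
    unsigned long long tag = 0;  // assumed default tag of a never-filled block
    int context_id = -1;         // -1: the block belongs to no context
};

// A hit requires both the tag and the owning context to match.
bool is_hit(const std::vector<blk_sketch>& set, unsigned long long tag, int context_id) {
    for (const blk_sketch& b : set)
        if (b.tag == tag && b.context_id == context_id) return true;
    return false;
}
```

If context_id defaulted to 0 instead, context 0 would appear to hit on an empty block whenever its address mapped to the default tag.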

Cores (core_t)

The use of the statistics handler is no longer implemented. Instead, the core can dump its stats directly.
A copy constructor for core_t has been provided. It will be revised in the future and may not be safe to use once the simulation has been initialized (it is fine during the resize of the std::vector<core_t> cores).
icount checks have been changed since the data type is unsigned.

EIO Handler

eio_read_chkpt now closes the input file.


md_fault_type now has md_fault_segfault that detects out-of-bounds usage of memory mapped files (shared only).


Unaligned quadword loads and stores are now supported (when on 4-byte boundaries). This is a common problem and would cause legitimate code to fail. Other misalignments are not supported (and usually indicate a more significant problem).
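One way to picture the alignment rule is a quadword load built from two 4-byte accesses. The sketch below works on a flat little-endian byte buffer; load_long and load_quad are hypothetical helpers, and M-sim's memory code may handle this case differently.

```cpp
#include <cstddef>
#include <cstdint>

// Assembles a little-endian 32-bit value from four bytes.
static std::uint32_t load_long(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Hypothetical quadword load: succeeds on any 4-byte boundary (which includes
// naturally aligned 8-byte addresses) and reports failure otherwise.
bool load_quad(const std::uint8_t* mem, std::size_t mem_size,
               std::uint64_t addr, std::uint64_t& out) {
    if (addr % 4 != 0) return false;                           // other misalignments are rejected
    if (addr > mem_size || mem_size - addr < 8) return false;  // read would run past the buffer
    std::uint64_t lo = load_long(mem + addr);
    std::uint64_t hi = load_long(mem + addr + 4);
    out = lo | (hi << 32);
    return true;
}
```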


exit_code is no longer needed (setjmp/longjmp removed).


sys_syscall no longer takes a memory accessor function pointer.
mmap is no longer handled in this file; it is handled by memory.c and memory.h.
Many of the classes defined for use with alpha datatypes have been given initializers to zero out the object.
class osf_statfs now more closely resembles its Tru64 representation.
Some support for tbl_procinfo has been added.
A get_filename(...) method is provided; it reads simulated memory and returns the filename as a std::string. Fixes can be applied here (such as converting "//bwaves.in" to "bwaves.in", or checking whether the request is for a system file and redirecting to the appropriate filename).
sys_output(...), with semantics similar to printf, has been provided. It is disabled by default (define SYS_DEBUG in syscall.c to enable). This will provide console output when syscalls are called - often providing some information about what is going on. For bug reports, running the simulator with this flag enabled and providing that output is extremely helpful in terms of tracking down bugs.
The macro "#define arg(X) regs->regs_R[MD_REG_##X]" has been provided. Register access is extremely frequent in syscall.c, and the macro cuts down on the repeated text while remaining reasonably readable.
Additional system calls now have their file descriptors checked in case they need redirection (mmap needs this as well but does not use the expected argument register, so this is done as part of mmap).
osf_sys_exit and osf_sys_exit_group write their return values into the pid handler, eject themselves from the core, and close all of their open files.
osf_sys_open catches attempts to open /dev/tty and returns the file descriptor for stdout.
osf_sys_pipe now works correctly. These file descriptors cannot be recovered in a checkpoint yet.
osf_sys_ioctl::tiocisatty returns true if the file descriptor is 1 or 2 (or if it is redirected to 1 or 2).
osf_sys_kill now kills a thread. The signals are not used, just reported. The victim is flushed and then ejected. The pid handler receives the signal as a return value.
osf_sys_umask was using the wrong result register, it now uses V0.
osf_tbl_procinfo is somewhat supported.
osf_sys_dup is now supported.
osf_sys_dup2 is now supported.
osf_sys_fcntl is probably less supported than before. Support for F_GETFD/F_SETFD is needed. Others, such as F_DUPFD, are probably translated to dup anyway.
osf_sys_setrlimit stack requests are limited to 0x2000000.
osf_sys_getsysinfo is now supported for GSI_CLK_TCK (returns 1 GHz).
osf_sys_select was not supported properly in terms of timeout. The current implementation assumes infinite timeout but allows the remainder of the simulation to continue (essentially acting as non-blocking).
osf_sys_madvise no longer does anything when MADV_DONTNEED is used (it shouldn't have in the first place). MADV_SPACEAVAIL is still unsupported.
osf_sys_mprotect is now partially supported.
osf_sys_msync is partially supported. We may need to handle the case where msync is called on shared mappings (which may be the only case anyway...).
osf_sys_set_program_attributes is now supported (required for the dynamic loader)
osf_sys_readlink is now supported.
osf_sys_execve is now supported. It clears register UNIQ and clears allocated memory (except pages marked by MAP_INHERIT).
osf_sys_wait4 is now supported.
osf_sys_munmap is now properly supported.
osf_sys_fork is supported. However, the contexts must be made available in advance (using -max_contexts_per_core). Forked contexts may not be placed on the same core. Code for forking to a newly created core is not complete but is retained in the release.
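To show how the arg(X) macro from the list above reads in practice, here is a small self-contained sketch. Only the macro itself is taken from this page. The register file type, the register index values, NUM_REGS_SKETCH, and the handler sketch_add_syscall are hypothetical stand-ins, and the convention of writing 0 to A3 on success is stated here as an assumption.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the simulator's register definitions.
typedef std::uint64_t md_gpr_t;
enum { MD_REG_V0 = 0, MD_REG_A0 = 16, MD_REG_A1 = 17, MD_REG_A3 = 19, NUM_REGS_SKETCH = 32 };
struct regs_t { md_gpr_t regs_R[NUM_REGS_SKETCH]; };

// The macro as given in the release notes; it expects a variable named regs in scope.
#define arg(X) regs->regs_R[MD_REG_##X]

// Hypothetical handler: returns the sum of its first two arguments in V0
// and writes 0 to A3 (assumed here to signal success).
void sketch_add_syscall(regs_t* regs) {
    arg(V0) = arg(A0) + arg(A1);
    arg(A3) = 0;
}
```

Without the macro, the first statement would read regs->regs_R[MD_REG_V0] = regs->regs_R[MD_REG_A0] + regs->regs_R[MD_REG_A1];.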


The statistics handled here really belong to the memory code. They have been removed.

  • no longer requires an eio_fd.
  • detects dynamic executables and attempts to reload the program using loader. File path will be moved to a configuration file in the future.
  • dynamic loading sets values within a context that allow us to fast-forward past the initial loading of the executable (we do not skip the fini sections).
  • argv and envp variables are quadword aligned now. envp and argv are zeroed out (if we write "a", the next 7 bytes (quadword aligned) are set to 0). This did not appear necessary but makes the trace more accurate compared to the native machine.
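The quadword padding of argv and envp strings can be sketched as below. The byte vector stands in for the region where the strings are written, and quad_align and push_string are hypothetical helpers; the real loader writes into simulated memory and its layout is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Rounds n up to the next multiple of 8 (one quadword).
std::size_t quad_align(std::size_t n) {
    return (n + 7) & ~static_cast<std::size_t>(7);
}

// Hypothetical helper: appends s to the buffer in a zero-filled,
// quadword-aligned slot and returns the offset at which it starts.
// Assumes the buffer size is already a multiple of 8.
std::size_t push_string(std::vector<std::uint8_t>& buf, const std::string& s) {
    std::size_t start = buf.size();
    buf.insert(buf.end(), s.begin(), s.end());
    buf.resize(start + quad_align(s.size() + 1), 0);  // terminator and padding are zero
    return start;
}
```

Writing "a" therefore occupies eight bytes: the character followed by seven zero bytes.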

PID Handler

A process ID handler has been provided. This adds proper support for OSF_SYS_wait4. The PID handler retains return values for processes that terminate and allows them to be recovered, or tested for existence (to avoid having the thread continually enter the pipeline). Various places required updates for this:

smt.h/smt.c:    contexts now retain their own pid
sim-outorder.c: context initialization acquires a pid
                pids are used to check if a context is still alive
syscall.c:      OSF_SYS_exit now clears a pid on termination, stores return values
                OSF_SYS_kill is able to kill a context, nothing is done about return values at this point
                OSF_SYS_wait4 can check for a particular child process (or -1 for any child) as well as acquire the proper return value
                OSF_SYS_fork acquires pids for the child as well as adds the child pid to the parent's pid list
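The bookkeeping described above might look roughly like the following. This is a sketch under assumptions: the class name, method names, and the choice to start pids at 1 are all hypothetical and do not reflect the actual pid handler interface.

```cpp
#include <map>
#include <set>

// Hypothetical pid handler sketch.
class pid_handler_sketch {
public:
    // Hands out a new pid; a non-negative parent records the parent/child link (fork).
    int acquire(int parent = -1) {
        int pid = next_pid_++;
        alive_.insert(pid);
        if (parent >= 0) children_[parent].insert(pid);
        return pid;
    }

    bool is_alive(int pid) const { return alive_.count(pid) != 0; }

    // Called on exit or kill: the pid stops being alive and its status is retained.
    void terminate(int pid, int status) {
        if (alive_.erase(pid)) exit_status_[pid] = status;
    }

    // wait4-like query: child == -1 means "any child". On success, stores the
    // status, forgets the child, and returns its pid. Returns -1 when no
    // matching child has terminated yet, so the caller keeps waiting.
    int reap(int parent, int child, int& status) {
        auto kids = children_.find(parent);
        if (kids == children_.end()) return -1;
        for (int pid : kids->second) {
            if (child != -1 && pid != child) continue;
            auto done = exit_status_.find(pid);
            if (done == exit_status_.end()) continue;
            status = done->second;
            exit_status_.erase(done);
            kids->second.erase(pid);   // safe: we return before iterating further
            return pid;
        }
        return -1;
    }

private:
    int next_pid_ = 1;
    std::set<int> alive_;
    std::map<int, int> exit_status_;            // pid -> retained return value
    std::map<int, std::set<int> > children_;    // parent pid -> child pids
};
```

Because the status is retained after termination, a waiting parent can be checked cheaply without re-entering the pipeline.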

Memory Mapping

Memory mapping has been moved out of syscall.c. Memory mapping is largely supported at this time (FIXED, PRIVATE, ANON, SHARED, location hints, etc.). FIXED mappings are slightly hacked together. If a FIXED mapping overlaps an old mapping, it is supposed to deallocate the prior pages and replace them with itself. We do not handle the deallocation but instead place the mapping at the front of the mapping list (ensuring it is accessed first). If this mapping is unmapped, the old mapping would then be accessed and no segmentation fault would occur (should it?). Since there is no real mapping applied in the FIXED case, this is probably OK.
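The front-of-list behaviour for FIXED mappings can be illustrated with a small sketch. The types and method names below are hypothetical; they only model the lookup order described in the paragraph above, not the data structures in memory.c.

```cpp
#include <cstdint>
#include <list>

// Hypothetical mapping record: [base, base + length) with an identifier.
struct mapping_sketch {
    std::uint64_t base;
    std::uint64_t length;
    int id;
};

class mapping_list_sketch {
public:
    // A FIXED mapping goes to the front so it shadows any older overlapping mapping.
    void map_fixed(const mapping_sketch& m) { maps_.push_front(m); }

    // Returns the id of the first mapping that contains addr, or -1 if none does.
    int lookup(std::uint64_t addr) const {
        for (const mapping_sketch& m : maps_)
            if (addr >= m.base && addr - m.base < m.length) return m.id;
        return -1;
    }

    // Removing the newer mapping exposes the older one again.
    void unmap(int id) {
        maps_.remove_if([id](const mapping_sketch& m) { return m.id == id; });
    }

private:
    std::list<mapping_sketch> maps_;
};
```

The last step is the case questioned in the text: after the newer mapping is removed, the address resolves to the older mapping instead of faulting.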


Only shared memory mappings are actually implemented; the rest are faked (since the results aren't preserved, this is fine).
A direct memory access handler (mem_access_direct) is provided, which ignores alignment faults. It is used for rollback after a branch misprediction.
Memory can dump its stats directly instead of having it done by loader_t through the statistics handler.


mystrdup removed.


The variable, pseq, used for pipetrace sequence numbers is now a long long instead of an unsigned int (affects ptrace.[ch] and rob.[ch]).


DLite has been rewritten as a class/struct and is provided for each context. osf_sys_fork rebuilds DLite for the parent thread (fixes pointers) and provides the child its own DLite.
DLite option "-i" is currently disabled (the global variable it used no longer exists). It must be reassociated with DLite for the first context but the context doesn't exist at that point. This will be corrected in a future revision.

Contexts (smt.[ch])

eio_fd (eio trace handle) is removed
context::interrupts has been added. This allows us to detect special phases that a context may be in (dynamically loading, waiting for syscall wait, waiting for syscall select, fix pipeline due to syscall execve)
Other member variables have been added to support these phases.
child_pids is no longer stored here. The pid handler takes care of this.

File Table

Pipes are now handled (somewhat) in the file tables (sockets are still not well supported). Pipes are identified using "FE_PIPE" in file_table.h.

#define FE_PIPE				0x00010

Additional functionality has been added: copy_from(...) allows a file table to be copied from one context to another - this is needed for forking.

		void copy_from(const file_table_t & rhs);

dup2(...) is used to implement the dup2 system call.

		md_gpr_t dup2(unsigned int handle_old, unsigned int handle_new);

insert(...) adds a file descriptor (using its real value) into the file table with an associated name. This is used with pipe to give the file descriptors to a context as well as redirection.

		void insert(md_gpr_t fd, std::string name);

closeall() is used when a process terminates to close all of its open files.

		void closeall();

Fixed a bug in file_table_t::opener(...). simulated_fd should not be set to temp, this could cause duplicate simulated file descriptors to occur (this should not be a problem in a single threaded, non-forking simulation).
Preliminary support for PIPE checkpointing has begun.
Provided dup (as duper(...)) support.
Provided lowest_avail_sim_fd(...) to acquire available file descriptor (simulated) values.
Provided closeall(...) to force closing of files as explicit destruction.
Closing of standard i/o is only allowed if redirection is used. No context may kill the real stdin/stdout/stderr descriptors.
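The idea behind lowest_avail_sim_fd(...) in the list above is to hand out the smallest unused simulated descriptor. A minimal sketch of that search follows; the function name lowest_avail_fd and the use of a std::set to track descriptors are assumptions, not the file table's actual representation.

```cpp
#include <set>

// Hypothetical helper: returns the smallest non-negative descriptor
// that does not appear in in_use.
int lowest_avail_fd(const std::set<int>& in_use) {
    int fd = 0;
    for (int used : in_use) {      // std::set iterates in ascending order
        if (used < 0) continue;    // ignore invalid entries
        if (used != fd) break;     // found a gap below this entry
        ++fd;
    }
    return fd;
}
```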


ejected_contexts has been added. This is to store contexts that have been killed/exited such that they stop using up resources. This is preliminary support.
ff_mode_t has been added. This indicates to the fast forwarding logic if it should be warming up caches or not.
MAX_IDEPS has been removed.
print_power_stats is now false by default.
Default decode/issue/commit widths are now 8. The dl1 cache is now 64KB, dl2 is now 512KB, and il2 is now 64KB. The return address stack is now 16 entries.
sim_uninit(...) now calls statistics printers for various objects.
The commit timeout handler has been adjusted to not falsely trigger if a thread is waiting on syscall wait or syscall select.
register_rename(...) now skips contexts that are waiting on syscall wait or syscall select.
In register_rename(...), last_fetch_redirected was checked too early. It is now checked after the generation point and no longer wastes an iteration of the rename loop.
In register_rename(...), the rollback logic has been adjusted to use mem_access_direct.
In register_rename(...), the syscall execve can stop renaming for the context (as well as clear the instruction fetch queue).
fetch(...) ignores threads that do not have a pid.
fetch(...) detects threads that have begun dynamic loading and allows them to fast-forward to their entry point (without warming up the caches).
Instruction fetches are checked to see if the memory page they are associated with exists. Instruction fetches that are below the text_base (but greater than 0x1000, since misspeculative addresses starting at 0x1 do occur normally) are checked as well.
max_ff_left(...) has been removed. We loop from [0,fastfwd_amt) and give each thread a chance to fast forward a single instruction. These threads may be dependent on each other, therefore, we can't fast-forward them one at a time.
ff_context(...) now handles fast-forwarding. Fast-forwarding checks for special events, such as syscall wait or select, and prevents them from hindering the simulation.
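The interleaved fast-forward loop described in the last two items can be sketched as follows. The thread record and the function ff_one are hypothetical stand-ins (ff_one plays the role that ff_context(...) has in the simulator), and the handling of syscall wait/select is reduced to a simple active flag.

```cpp
#include <vector>

// Hypothetical per-thread state for the sketch.
struct ff_thread_sketch {
    bool active;                   // false: no pid, or blocked on wait/select
    unsigned long long executed;   // instructions fast-forwarded so far
};

// Stand-in for ff_context(...): advances one thread by a single instruction.
bool ff_one(ff_thread_sketch& t) {
    if (!t.active) return false;
    ++t.executed;
    return true;
}

// Each outer iteration gives every thread a chance to execute one instruction,
// so threads that depend on each other advance together.
void fast_forward(std::vector<ff_thread_sketch>& threads, unsigned long long fastfwd_amt) {
    for (unsigned long long i = 0; i < fastfwd_amt; ++i)
        for (ff_thread_sketch& t : threads)
            ff_one(t);
}
```

Fast-forwarding one thread to completion before starting the next could stall if that thread waits on another, which is the dependency the interleaving avoids.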
