MrSID is one of the few image formats I don't have support for. A few applications are capable of reading it and very few are capable of writing it.
One of the key features of the MrSID format is that it is lossy. When you rewrite such a file in another lossy format (even one based on wavelet compression), there is a good chance you'll add new artifacts. For this reason, I would recommend converting the files to PNG, which is lossless, to work with the IMC.
As far as size goes, on-disk file sizes are a bit misleading. When a compressed file is read in, it must be held in decompressed form for modification and quick read access. In most cases, this is 3 or 4 bytes per pixel depending on whether an alpha channel is stored. It can be as little as 1 byte per pixel for formats with a color dictionary, such as GIF, and as much as 12 or 16 bytes per pixel for formats that store float precision (RGB96/RGBA128). Generally, it's recommended to use images of around 3000 by 3000 pixels, although a few users have decided to test it with images of roughly 100k by 100k pixels. With the IMC v0.7.3.0, there is a good chance you'll get a crash when processing an extremely large dataset. You should either process the raster with IMC v0.6.6.0 or wait for IMC v0.7.4.0 to be released.
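To make the per-pixel arithmetic concrete, here is a minimal Python sketch that estimates the decompressed footprint of an image. It is only an illustration of the numbers above, not how the IMC allocates memory; the layout names and the decompressed_bytes helper are hypothetical.

```python
# Rough decompressed-size estimate. The layout names are illustrative labels,
# not identifiers used by the IMC.
BYTES_PER_PIXEL = {
    "indexed": 1,    # color dictionary / palette, e.g. GIF
    "rgb": 3,        # 8 bits per channel, no alpha
    "rgba": 4,       # 8 bits per channel plus alpha
    "rgb96": 12,     # three 32-bit floats
    "rgba128": 16,   # four 32-bit floats
}


def decompressed_bytes(width, height, layout="rgb"):
    """Return the number of bytes needed to hold the raw pixels in memory."""
    return width * height * BYTES_PER_PIXEL[layout]


if __name__ == "__main__":
    for w, h, layout in [(3000, 3000, "rgb"), (100_000, 100_000, "rgba")]:
        size = decompressed_bytes(w, h, layout)
        print(f"{w} x {h} {layout}: {size / 2**20:,.1f} MiB")
```

A 3000 by 3000 RGB image needs 27,000,000 bytes once decompressed, while a 100k by 100k RGBA image needs 40,000,000,000 bytes, no matter how small the compressed file looks on disk.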
I would recommend running the entire thing together and using a shapefile to define the bounds. It would be best if you don't downsample the images too many times when creating raster levels. Usually, a good multiplier is to generate a new downsampled level every 3 or 4 binary levels (Ex: 1, 8, 64, 512, etc. when stepping by 3). When everything is run together, the IMC must parse the entire image collection to determine which images are required for each tile. Unfortunately, there is no support for extracting the width/height without reading the entire image file in the image library I use (or, to my knowledge, any image library out there), so this is rather slow.
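The spacing of downsampled levels can be sketched in a few lines of Python. This is only an illustration of the "every 3 or 4 binary levels" rule of thumb; the function name and parameters are hypothetical and do not correspond to IMC settings.

```python
def downsample_factors(max_factor, binary_step=3):
    """List downsample factors, multiplying by 2**binary_step at each level.

    binary_step=3 gives 1, 8, 64, 512, ...; binary_step=4 gives 1, 16, 256, ...
    """
    if binary_step < 1:
        raise ValueError("binary_step must be at least 1")
    factors = []
    factor = 1
    while factor <= max_factor:
        factors.append(factor)
        factor <<= binary_step  # skip ahead binary_step powers of two
    return factors


if __name__ == "__main__":
    print(downsample_factors(512, binary_step=3))
    print(downsample_factors(256, binary_step=4))
```

Stepping by 3 or 4 powers of two keeps the number of times any pixel is resampled small compared with halving at every level.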
The IMC has limited support for the TIFF format. It reads the imagery fine, but the GeoTIFF tags are completely ignored. This is because GeoTIFFs are georeferenced with tie points, and it's difficult to create merged layers from multiple images using this method. It's best to have a consistent pixel representation across images, and the IMC requires that the actual image has no rotation.
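As a rough pre-flight check for the no-rotation requirement, the Python sketch below assumes each image's georeferencing is available as an ESRI-style world file (six numbers, one per line). That is an assumption made for illustration; the IMC's own input may differ, and both helper names are hypothetical.

```python
def parse_world_file(text):
    """Parse the six world-file values, which appear in the order A, D, B, E, C, F."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    a, d, b, e, c, f = values
    return {
        "x_pixel_size": a,  # A: pixel width in map units
        "row_rotation": d,  # D: rotation/skew term, must be 0 for no rotation
        "col_rotation": b,  # B: rotation/skew term, must be 0 for no rotation
        "y_pixel_size": e,  # E: pixel height, usually negative
        "x_origin": c,      # C: x of the center of the upper-left pixel
        "y_origin": f,      # F: y of the center of the upper-left pixel
    }


def is_unrotated(world, tolerance=1e-12):
    """True when both rotation terms are zero within the given tolerance."""
    return (abs(world["row_rotation"]) <= tolerance
            and abs(world["col_rotation"]) <= tolerance)


if __name__ == "__main__":
    sample = "0.5\n0.0\n0.0\n-0.5\n500000.25\n4200000.75\n"
    print(is_unrotated(parse_world_file(sample)))
```

Comparing x_pixel_size and y_pixel_size across all images in a collection is a similar quick way to confirm a consistent pixel representation.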
Contours, as well as any other vector data, can sit on top of the raster. Polygons need the render over raster option to be set, because by default they are not displayed on top of raster.