Several transformations, in particular AntiTaintAnalysis, use implicit flow as a basic building block. Before you can use these you need to call the --Transform=InitImplicitFlow and list which of the implicit flow variants you are going to use later. Here's an example of what it might look like:
tigress --Seed=0 --Verbosity=1 --Environment=x86_64:Darwin:Gcc:4.6 \ --Transform=InitEntropy --Functions=main \ --Transform=InitImplicitFlow --Functions=main \ --InitImplicitFlowKinds=trivial_thread_1,trivial_counter,mem_cache_time,mem_cache_thread_1,file_cache_time,file_cache_thread_1,jit_time \ --InitImplicitFlowHandlerCount=1 \ --InitImplicitFlowJitCount=1 \ --InitImplicitFlowJitFunctionBody="(for (if (bb 50) (bb 50)))" \ --InitImplicitFlowTrace=false \ --InitImplicitFlowTrain=false \ --InitImplicitFlowTime=false \ --InitImplicitFlowTrainingTimesClock=500 \ --InitImplicitFlowTrainingTimesThread=500 \ --InitImplicitFlowTrainingMinGap=90 \ --InitImplicitFlowTrainingConfidenceLevel=0.99 \ --InitImplicitFlowTrainingTargetErrorRate=0.00001 \ --InitImplicitFlowTrainingKind=statistics \ --Transform=AntiTaintAnalysis --Functions=main \ --AntiTaintAnalysisKinds=vars \ --LocalVariables=main:b \ --AntiTaintAnalysisImplicitFlow="(repeat mem_cache_time 3)" \ input.c --out=output.c
Here, we specify that we might want to use the implicit flow variants trivial_thread_1,trivial_counter,mem_cache_time,mem_cache_thread_1,file_cache_time,file_cache_thread_1,jit_time. We then insert implicit flow using (repeat mem_cache_time 3) after any assignment to the variable b in function main. The transformation of the assignment b = a would look something like this:
exp_orig3 = a; exp_orig3_origPtr4 = (unsigned char *)(& exp_orig3); exp_orig3_copyPtr6 = (unsigned char *)(& exp_orig3_copy5); size_iter7 = 0; while (size_iter7 < 4) { maxCount34 = 0; while (maxCount34 < 1) { majority_iter9 = 0; while (majority_iter9 < 255) { counters8[majority_iter9] = 0; majority_iter9 ++; } repeat_iter32 = 0; while (repeat_iter32 < 3) { charCopy10 = 0; copy_iter11 = 0; while (copy_iter11 < 8) { if ((*exp_orig3_origPtr4 >> copy_iter11) & 1) { pagesize13 = getpagesize(); memalignRet14 = posix_memalign(& __buffer_slow_1, pagesize13, 64); __asm__ volatile ("cpuid\n" "rdtsc\n": "=a" (low16), "=d" (high17)); sum20 = _1entropy ^ 1; iter21 = 0; while (iter21 < _2_time_implicitFlow_memCache_numberOfReads) { __asm__ volatile ("mfence\n" "clflush (%0)\n": : "r" (__buffer_slow_1)); sum20 += *((long *)__buffer_slow_1); *((long *)__buffer_slow_1) = sum20; iter21 ++; } _1entropy = sum20 - _1entropy; __asm__ volatile ("cpuid\n" "rdtsc\n": "=a" (low18), "=d" (high19)); free(__buffer_slow_1); time12 = (((unsigned long )high19 << 32) | (unsigned long )low18) - (((unsigned long )high17 << 32) | (unsigned long )low16); } else { pagesize22 = getpagesize(); memalignRet23 = posix_memalign(& __buffer_fast_1, pagesize22, 64); memalignRet23 = posix_memalign(& __buffer_fast_2, pagesize22, 64); __asm__ volatile ("cpuid\n" "rdtsc\n": "=a" (low26), "=d" (high27)); sum30 = _1entropy | 2; iter31 = 0; while (iter31 < _2_time_implicitFlow_memCache_numberOfReads) { __asm__ volatile ("mfence\n" "clflush (%0)\n": : "r" (__buffer_fast_1)); sum30 += *((long *)__buffer_fast_2); *((long *)__buffer_fast_2) = sum30; iter31 ++; } _1entropy = sum30 | _1entropy; __asm__ volatile ("cpuid\n" "rdtsc\n": "=a" (low28), "=d" (high29)); free(__buffer_fast_1); free(__buffer_fast_2); time12 = (((unsigned long )high29 << 32) | (unsigned long )low28) - (((unsigned long )high27 << 32) | (unsigned long )low26); } if ((double )time12 > _2_time_implicitFlowTimerClock_mem_cache_time_timingCounter_middleReferenceTime) { charCopy10 |= 1 << copy_iter11; } else { } copy_iter11 ++; } (counters8[charCopy10]) ++; repeat_iter32 ++; } maxCount34 = 0; max_iter35 = 0; while (max_iter35 < 255) { if (counters8[max_iter35] > maxCount34) { maxCount34 = counters8[max_iter35]; winner33 = max_iter35; } else { } max_iter35 ++; } } *exp_orig3_copyPtr6 = winner33; exp_orig3_origPtr4 ++; exp_orig3_copyPtr6 ++; size_iter7 ++; } b = exp_orig3_copy5;
Implicit Flow Kinds
Some primitives are based on the timing differences between a slow and a fast process:void P'() { value=0; for(i in [0...(bits in b)-1]) { timeT start = time(); if ((i:th bit of b)==1) slow process(param); else fast process(param); timeT time = time()-start; if (time > threshold) value |= 1 << i; } a = value; }
A variant of such timing-based flow primitives use threads to do the timing. This helps with stealth:
int result; void slow(int v) { slow process(param); result = v; } void fast(int v) { fast process(param); result = v; } void P'() { a=0; for(i in [0...(bits in b)-1]) { int bit = i:th bit of b; s=thread_create(slow, bit); f=thread_create(fast, !bit); thread_join(s); thread_join(f); if (result) a |= 1 << i; } }There are two implementations of these threading-based primitives, using either one or two threads. This allows you to embed threading code that matches the code that is already in the program.
Kind | Description |
---|---|
counter_int | Copy a variable by counting up to its value. I.e., the assigment
a=b is transformed into
int i,a = 0; for(i=0;i<b;i++) a++; |
counter_float | Same as counter_int, but using a float. |
counter_signal | Copy a variable bitr-by-bit using a signal handler: |
bitcopy_unrolled | Copy a variable bit-by-bit, each bit tested by an if-statement. |
bitcopy_loop | Loop over the bits in a variable and copy each bit by testing in an if-statement. |
bitcopy_signal | Loop over the bits in a variable and copy each bit in a signal handler.
unsigned int value; int bitNo; void handler(int sig){ value |= 1 << bitNo; } { value=0; signal(31, handler); for(i in 0..(bits in b)-1]) { if ((i:th bit of b)==1) { bitNo=i; raise(31); } } signal(31, 1); a = value; } |
file_write | Copy a variable byte-by-byte by writing to a file and reading it back again. |
trivial_clock | Copy a variable by timing a trivial loop. |
trivial_thread_1 | Same as trivial_clock, but using threads for timing. |
trivial_thread_2 | Same as trivial_thread_1, but using two threads instead of one. |
file_cache_time | Copy a variable by timing a file being read with caching turned on or off.
process<(param,nocache): posix_memalign(&buf,pagesize, pagesize); fd=open("/tmp/file.txt", writing); fcntl(fd, F_NOCACHE, nocache); for(i=0; ireading); fcntl(fd, F_NOCACHE, nocache); start = time(); for(i=0; islow process(param): process(param,1): fast process(param): process(param,0): |
file_cache_thread_1 | Same as file_cache_time , but using threads for timing. |
file_cache_thread_2 | Same as file_cache_thread_1, but using two threads instead of one. |
mem_cache_time | Copy a variable by timing a CPU data cache being read with or without caching.
posix_memalign(&buf1,pagesize,64); posix_memalign(&buf2,pagesize,64); slow process(param): for(i=0; i< param;i++){ asm ("mfence\n" "clflush (%0)\n"::"r"(buf1)); sum += *((long *)buf1); *((long *)buf1) = sum; } fast process(param): for(i=0;i < param;i++){ asm ("mfence\n" "clflush (%0)\n"::"r"(buf1)); sum += *((long *)buf2); *((long *)buf2) = sum; } |
mem_cache_thread_1 | Same as mem_cache_time, but using threads for timing. |
mem_cache_thread_2 | Same as mem_cache_thread_1, but using two threads instead of one. |
time_jit | Copy a variable by timing a jitted function.
int freq=0; void foo(input,output) { static void (*foop)(...,...); if (freq==0) { foop = JIT(bytecodes) freq++; } (*foop)(input,output); } slow process(param): freq=0; start=time(); foo(...,...); time=time()-start; fast process(param); freq=0; foo(...,...); start=time(); foo(...,...); time=time()-start; |
Training
All the implicit flow primitives that are based on timing need to be trained, i.e. we have to run tests to see what is a good threshold between slow and fast primitives. There are two major kinds of training, gap and resend. The gap training simply sends lots of "slow" and "fast" tests, and finds a gap between them. The resend training is similar, but uses more sophisticated statistical tests (Wilson's Score Rule) to figure out how many times a bit should be sent in order to achieve a particular confidence level and target error rate. The commands look like this:
--Transform=InitImplicitFlow \ --Functions=... \ --InitImplicitFlowKinds=... \ --InitImplicitFlowTrain=true \ --InitImplicitFlowTrainingTimesClock=50 \ --InitImplicitFlowTrainingTimesThread=50 \ --InitImplicitFlowTrainingGapMaxFailureRateClock=5 \ --InitImplicitFlowTrainingGapMaxFailureRateThread=5 \ --InitImplicitFlowTrainingGapMinGap=90 \ --InitImplicitFlowTrainingKind=gap \
or like this:
--Transform=InitImplicitFlow \ --Functions=... \ --InitImplicitFlowKinds=... \ --InitImplicitFlowTrain=true \ --InitImplicitFlowTrainingResendConfidenceLevel=0.99 \ --InitImplicitFlowTrainingResendTargetErrorRate=0.00001 \ --InitImplicitFlowTrainingKind=resend \The output would look like this (for the resend training kind):
TRAINING_DATA '(mem_cache_time 4496.000000 16 4)'which means that for the mem_cache_time implicit flow primitive, a parameter value of "16" is optimal, that each bit needs to be sent "4" times, and that the threshold between slow and fast processes is "4496". You can then feed this into another run of tigress:
--Transform=InitImplicitFlow \ --InitImplicitFlowTrain=false \ --InitImplicitFlowTrainingData='(mem_cache_time 4496.000000 16 4)' \ --Transform=AntiTaintAnalysis \ --Functions=... \You can also run the training at startup time of your program, i.e., insert your training code at the beginning of main:
--Transform=InitImplicitFlow \ --InitImplicitFlowTrain=true \ --Transform=AntiTaintAnalysis \ --Functions=... \
Implicit Flow Combiners
Implicit flow primitives can be combined for improved correctness and higher levels of resistance to analysis. Combiners are expressed as S-expressions:
type combiner = | (single primitive) | (compose combiner combiner ...) | (select combiner combiner ...) | (majority combiner combiner ...) | (repeat combiner int) | (until combiner int int)Their behavior as as follows:
- (compose a b c) chains together several combiners, i.e. a(b(c)).
- (select a b c) picks one of combiners a,b,c at random, based on input.
- (majority a b c) chooses the most frequently occurring value computed by a,b,c.
- (repeat a n) is equivalent to (majority a a ... a) with n copies of
. - (until a m n) continuously repeats combiner a m times until there is at least n agreements on the resulting value.
You would use these in the AntiTaintAnalysis command's --AntiTaintAnalysisImplicitFlow=... option:
--Transform=AntiTaintAnalysis \ --Functions=main \ --AntiTaintAnalysisKinds=vars \ --LocalVariables=main:b \ --AntiTaintAnalysisImplicitFlow="(repeat mem_cache_thread_1 3)" \Here are some examples of combiners:
(single counter_int) (single counter_float) (single counter_signal) (single bitcopy_unrolled) (single bitcopy_loop) (single bitcopy_signal) (single file_write) (repeat trivial_thread_1 3) (repeat trivial_counter 3) (repeat mem_cache_time 3) (repeat mem_cache_thread_1 3) (repeat file_cache_time 3) (repeat file_cache_thread_1 3) (repeat jit_time 3) (compose counter_int counter_float file_write) (select counter_int counter_float file_write) (majority mem_cache_thread_1 bitcopy_loop counter_signal) (majority mem_cache_thread_1 file_cache_time bitcopy_loop counter_signal counter_float)
Options
Option | Arguments | Description |
---|---|---|
--Transform | InitImplicitFlow | Call this before --Transform=AntiTaintAnalysis, in case you want to use the implicit flow copy kinds counter_signal and bitcopy_signal. This transformation inserts the requisite signal handlers and jitted functions. |
--InitImplicitFlowHandlerCount | INTSPEC | How many signal handlers to insert. Default=1. |
--InitImplicitFlowJitCount | INTSPEC | How many jitted functions to insert. Default=0. |
--InitImplicitFileCacheSize | INTSPEC | Size of the file (in 8k blocks) for a file cache implicit flow. Default=100. |
--InitImplicitFlowJitFunctionBody | S-Expression | S-Expression describing the control structures of the function to be jitted. See the RandomFuns transformation for description of the syntax. Default="(for (for (for (for (if (bb 2) (bb 2))))))". |
--InitImplicitFlowKinds | counter_int, counter_float, counter_signal, bitcopy_unrolled, bitcopy_loop, bitcopy_signal, file_write, trivial_clock, trivial_thread_1, trivial_thread_2, trivial_counter, file_cache_time, file_cache_thread_1, file_cache_thread_2, mem_cache_time, mem_cache_thread_1, mem_cache_thread_2, jit_time, branchPrediction_time, * | Comma-separated list of the kinds of implicit flow that can be inserted. Default=all options.
|
--InitImplicitFlowTrainingKind | gap, resend | How to do the training. Either by simply looking for a gap between slow and fast times, or to do a more sophisticated statistical test. Default=gap.
|
--InitImplicitFlowTrainingParameterRange | S-Expression | An S-Expression consisting of lists of elements of the form (implicit-flow-kind from to), indicating that we will start by training with the parameter set to from, then 2*from, 4*from, until to is reached. Default=(trivial_clock 1 10000) trivial_thread_1 1 10000) trivial_thread_2 1 10000) trivial_counter 1 10000) file_cache_time 1 1024) file_cache_thread_1 1 1024) file_cache_thread_2 1 1024) mem_cache_time 2 64) mem_cache_thread_1 2 64) mem_cache_thread_2 2 64) jit_time 0 0) jit_thread_1 0 0) jit_thread_2 0 0). |
--InitImplicitFlowTrace | BOOLSPEC | Trace the execution of the initialization of implicit flow. Default=false. |
--InitImplicitFlowTrain | BOOLSPEC | Generate a program that runs the training pass. The output will look something like this: T RAINING_DATA '(trivial_counter 779.000000 2) (trivial_thread 0.000000 8192) (file_cache_time 24618781.250000 512) (file_cache_thread 0.000000 2) (mem_cache_time 26890.750000 128) (mem_cache_thread 0.000000 2048) (jit_time 10020289.000000 0)'. This can then be input to a subsequent run of tigress, setting --InitImplicitFlowTrainingData= to this output. Default=false. |
--InitImplicitFlowTime | BOOLSPEC | Generate a program that runs lots of timing tests. The output looks something like this TRAINING_MEASURED_TIME,trivial_counter,fast,example.com,1,208.000000, where the two last entries is the parameter of the implicit flow kind and the time (usually in clock cycles). Default=false. |
--InitImplicitFlowTrainingTimesClock | INTSPEC | Number of training measurements to take for timing-based primitives. Default=0. |
--InitImplicitFlowTrainingTimesThread | INTSPEC | Number of training measurements to take for threading-based primitives. Default=0. |
--InitImplicitFlowTrainingGapMaxFailureRateClock | INTSPEC | Maximum acceptable failure rate for timing-based primitives, out of 100, for the gap training kind. Default=5. |
--InitImplicitFlowTrainingGapMaxFailureRateThread | INTSPEC | Maximum acceptable failure rate for thread-based primitives, out of 100, for the gap training kind. Default=5. |
--InitImplicitFlowTrainingGapMinGap | INTSPEC | Minimum gap between fast and midpoint and slow and midpoint, as a percentage, for the gap training kind. Default=80. |
--InitImplicitFlowTrainingResendConfidenceLevel | real | Confidence level, a floating point number close to 1.0, for the resend training kind. Default=0.95. |
--InitImplicitFlowTrainingResendTargetErrorRate | real | Acceptable error rate for the resend training kind. Should be a floating point number close to 0. Default=0.0001. |
--InitImplicitFlowTrainingData | SExpression | Set training data from previous run of --InitImplicitFlowTrain. Default=none. |
Issues
-
You may have to add -lm -lphtread to the Tigress command line.