Learning Rust Language Topics

types

  • only u8 ints can be cast to chars with as (e.g. 97u8 as char)
  • .len() returns the number of bytes, not the number of characters
  • .chars().count() returns the number of chars in a string
  • add _ to numbers for readability, they are ignored, e.g. 1_000_000
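A minimal sketch of the points above. The helper name byte_and_char_len is made up for this example; the accented letter is written as an escape so the byte count is unambiguous.

```rust
/// Returns (number of bytes, number of chars) for a string.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // Only u8 can be cast straight to char with `as`.
    let letter = 97u8 as char;
    println!("{letter}");

    // U+00E9 takes two bytes in UTF-8, so bytes and chars differ.
    let (bytes, chars) = byte_and_char_len("caf\u{e9}");
    println!("bytes: {bytes}, chars: {chars}");

    // Underscores in number literals are ignored by the compiler.
    println!("{}", 1_000_000 == 1000000);
}
```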

functions

  • skinny arrow -> indicates the return type
  • if you specify the types of a function's arguments, rust infers the type of the variables that you pass to the function
  • the last expression of a function, written without ;, is the value that gets returned
  • use return only when you need to return early

blocks

  • capture variables in strings in curlies “var: {var}”
  • a variable's lifetime ends at the end of the block it was declared in

logging

  • debug print datastructures that can't be printed with {} by using {:?}
  • multiline pretty print with {:#?}

variables

  • access a type's associated constants with ::, e.g. u8::MAX
  • make variables mutable with mut
  • you can’t change the type of a mut variable (shadow it with a new let instead)
  • when shadowing a variable inside a block, the original value is visible again after the block scope is closed
  • variables can be declared uninitialised (let x;) and assigned later, even inside an inner scope; the compiler checks that they are assigned before use
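A small sketch of shadowing and late initialisation. The function names shadow_demo and late_init are invented for the example.

```rust
/// Shadows `x` inside a block and returns (outer x, inner x).
fn shadow_demo() -> (i32, i32) {
    let x = 5;
    let inner; // declared without a value
    {
        let x = x * 2; // shadows the outer x only inside this block
        inner = x;     // assigned inside the inner scope, still alive afterwards
    }
    (x, inner) // the outer x is visible again
}

/// Declares a variable first and assigns it in one of two branches.
fn late_init(flag: bool) -> &'static str {
    let label;
    if flag {
        label = "yes";
    } else {
        label = "no";
    }
    label
}

fn main() {
    println!("{:?}", shadow_demo());
    println!("{}", late_init(true));
}
```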

memory

  • memory on stack needs fixed sizes
  • memory on heap can have unknown data length, but it’s slower
  • sometimes heap is not even available (embedded devices)

references

  • rust reference is a “memory safe” pointer: a reference to “owned” memory
  • references point to memory of other value: “borrowing” values
  • references are prefixed with &
    • let my_reference = &my_value
  • std::mem::size_of::<u8>() shows the size of a type in bytes
  • std::mem::size_of_val(&value) shows the size of a value in bytes
  • functions cannot return references to values that “die” in the function scope
  • * dereferences a reference. if it is a mutable reference you can use * to operate on the value

ownership

  • you can pass values that you own around
  • you can pass a reference to data
    • then the function can view the data but not mutate it
  • you can have unlimited immutable references to a value (read only)
  • you can have only one mutable reference to a value (write), and not while immutable references are still in use

moving / borrowing

  • when passing an owned object to a function (var: String), the function takes ownership and value will die at the end of the scope. This is called a “move” as the data is moved to a different scope
  • when passing a var: &String, the value doesn’t move or change ownership, value can be viewed not mutated and will not die
  • when passing a var: &mut String mutable reference, value doesn’t change ownership, doesn’t die and is mutable.
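The three ways of passing a String described above, as a minimal sketch. The function names consume, view and append are made up for illustration.

```rust
/// Takes ownership: the String is moved in and dropped at the end of this function.
fn consume(s: String) -> usize {
    s.len()
}

/// Borrows immutably: can read, cannot change, caller keeps ownership.
fn view(s: &String) -> usize {
    s.len()
}

/// Borrows mutably: can change the value, caller keeps ownership.
fn append(s: &mut String) {
    s.push_str(" world");
}

fn main() {
    let mut greeting = String::from("hello");
    println!("{}", view(&greeting));
    append(&mut greeting);
    println!("{greeting}");
    let length = consume(greeting);
    println!("{length}");
    // println!("{greeting}"); // would not compile: greeting was moved into consume
}
```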

strings

  • a string literal is a &str (more precisely a &'static str)
    • &str ("ref-str", a string slice) is a reference with a known length to string data that is owned by something else
  • a String has its data on the heap, owns its data and has functionality to mutate the data
  • the format! macro is like println! but creates a String instead of printing

floats

  • use convenience functions such as .floor(), .ceil(), .round() and .trunc()

copy / clone

  • Copy types (ints, floats, bool, char) always get copied when passed to a function
  • Strings don’t get copied, but can be cloned explicitly with .clone()

printing

  • multiline strings are allowed
  • use a raw string to print text with many characters that would otherwise need escaping: println!(r#"string with bunch " of " escapable chars"#)
  • prefix a string literal with b to get it as bytes: b"text"
  • use {:x} to print hex
  • use {:p} to print a pointer address, as in the memory address the reference points to
  • \u{hex} to print Unicode character
  • printing can also specify padding alignment and padding character
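A sketch of the printing options in this list, assuming the standard format specifiers {:x} (hex), {:p} (pointer), and fill/alignment/width such as {:*>8}. The function name formatted is invented for the example.

```rust
/// Returns (hex, right-aligned, centered) renderings.
fn formatted() -> (String, String, String) {
    let hex = format!("{:x}", 255); // lowercase hexadecimal
    let right = format!("{:*>8}", "ab"); // right-align in width 8, pad with *
    let centered = format!("{:-^7}", "hi"); // center in width 7, pad with -
    (hex, right, centered)
}

fn main() {
    let (hex, right, centered) = formatted();
    println!("{hex} {right} {centered}");

    // Raw string: quotes and backslashes need no escaping.
    println!(r#"a "quoted" word and a \ backslash"#);

    // Byte string: the literal is an array of u8 values.
    println!("{:?}", b"hi");

    // \u{...} prints a Unicode character by its hex code point.
    println!("\u{e9}");

    // {:p} prints the address a reference points to (differs per run).
    let value = 5;
    println!("{:p}", &value);
}
```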

arrays

  • array types are written element type first, then length: [&String; 2]
  • prefill an array with a value: let mut buffer = [0; 640]
  • slice an array with a range of indexes start..end. Exclusive end: &array[0..2], inclusive end: &array[0..=2]

vectors

  • Vec is to an array slice &[T] what String is to &str: the growable version that owns its data
  • Vec<String> vector of strings
  • Vec<(i32,i32)> vector of tuple of 2 i32s
  • all items within vector have the same type
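  • vector.capacity() shows how many items fit before the vector has to reallocate; when it is full, the data is copied to a new, larger allocation (typically double). Capacity is usually 4 when there is 1 value, but that is an implementation detail
  • Vec::with_capacity(8) avoids reallocations when you know the future vector size, which is better
  • convert an array into a vector: let vector: Vec<_> = [1,2,3].into();
  • use .extend to append the items of another collection to a vector:
let length: u32 = 2000;
let mut data: Vec<u8> = Vec::new();
// append the four little-endian bytes of length
data.extend(length.to_le_bytes());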

tuples

  • tuples can contain multiple different types
  • access tuple members with dot: tuple.0
  • destructure vector items by building a tuple from them
    • let vector = vec!["one", "two", "three"];
    • let (a, _, c) = (vector[0], vector[1], vector[2]);

match

  • match is like a switch but needs to be exhaustive
  • code block, fat arrows and “arms” for each case
fn main() {
    let my_number: u8 = 5;
    match my_number {
        0 => println!("it's zero"),
        1 => println!("it's one"),
        2 => println!("it's two"),
        _ => println!("It's some other number"),
    }
}
  • “match guards” are if conditions attached to a match arm
  • matches also work with tuples, allowing you to check multiple things and match them to a specific case, and catching all other cases
  • match stops at the first arm that matches
  • all arms of a match must return the same type
  • use @ to bind the matched value to a variable, for example to use it in a print statement
    • number @ 13 => println!("{} is unlucky", number)
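Match guards, tuple matching and @ bindings from the list above, as a minimal sketch. The functions describe and weather and their inputs are made up for illustration.

```rust
/// Uses a match guard and an @ binding; every arm returns a String.
fn describe(number: u8) -> String {
    match number {
        0 => "zero".to_string(),
        n if n % 2 == 0 => format!("{n} is even"), // match guard
        n @ 13 => format!("{n} is unlucky"),       // @ binds the matched value
        _ => "some other odd number".to_string(),  // catch-all keeps it exhaustive
    }
}

/// Matches on a tuple to check two things at once.
fn weather(sky: &str, temperature: i32) -> &'static str {
    match (sky, temperature) {
        ("cloudy", t) if t < 10 => "cloudy and cold",
        ("clear", _) => "clear",
        _ => "something else",
    }
}

fn main() {
    println!("{}", describe(13));
    println!("{}", weather("cloudy", 5));
}
```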

structs

  • unit struct, has nothing: struct FileDirectory;
  • tuple struct, has unnamed fields with only types: struct Color(u8,u8,u8);
  • named struct, has named fields: struct Light { on: bool, brightness: u8, color: Color }
  • field init shorthand allows you to skip the field: value part if the variable has the same name as the field:
let on = true;
let brightness = 50;
let color = Color(50, 50, 50);
let light = Light {
    on, brightness, color,
};

enum

  • use struct if you have many different things, use enum if you have one thing with many options
  • enum States { Play, Pause, Stop }
  • enums combine perfectly with match to name the states:
match state {
        ThingsInTheSky::Sun => println!("I can see the sun!"),
        ThingsInTheSky::Stars => println!("I can see the stars!")
    }
  • enums can contain data when created:
enum ThingsInTheSky {
    Sun(String), // Now each variant has a string
    Stars(String),
}
  • import enums to allow us to use the members directly:
fn match_mood(mood: &Mood) -> i32 {
    use Mood::*; // We imported everything in Mood. Now we can just write Happy, Sleepy, etc.
    let happiness_level = match mood {
        Happy => 10, // We don't have to write Mood:: anymore
        Sleepy => 6,
        NotBad => 7,
        Angry => 2,
    };
    happiness_level
}
  • enum members have indexes (discriminants) starting at 0, get them by casting the value: Variant as i32. If you assign a number to a member, the numbering of the following members continues from there
  • if you want to create a vector with different types, you can use enum with multiple values
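A sketch of the last two points: casting enum members to numbers, and using an enum to hold different types in one vector. The enums Star and Value and the function describe are invented for this example.

```rust
// Explicit discriminants: numbering continues from the last assigned value.
#[allow(dead_code)]
enum Star {
    BrownDwarf = 10,
    RedDwarf = 50,
    YellowStar, // 51
    RedGiant,   // 52
}

// One enum type lets a vector hold "different types".
enum Value {
    Number(i32),
    Text(String),
}

fn describe(values: &[Value]) -> Vec<String> {
    values
        .iter()
        .map(|value| match value {
            Value::Number(n) => format!("number {n}"),
            Value::Text(t) => format!("text {t}"),
        })
        .collect()
}

fn main() {
    println!("{}", Star::YellowStar as i32);
    let mixed = vec![Value::Number(1), Value::Text("one".to_string())];
    println!("{:?}", describe(&mixed));
}
```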

loops

  • loop {} loops indefinitely until break; is called
  • you can name loops with 'loop_outer: loop { break 'loop_outer; }
  • for number in 0..2 { } to loop over a range, use _ if you don’t need the variable
  • break can return a value that can be assigned from the loop
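Breaking with a value and breaking out of a named outer loop, as a short sketch. The function names and the search they perform are made up for illustration.

```rust
/// `break value` makes the loop itself evaluate to that value.
fn next_multiple_of_seven(start: u32) -> u32 {
    let mut counter = start;
    loop {
        counter += 1;
        if counter % 7 == 0 {
            break counter;
        }
    }
}

/// A label lets the inner loop stop the outer loop as well.
fn find_pair(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for a in 0..10 {
        for b in 0..10 {
            if a * b == target {
                found = Some((a, b));
                break 'outer;
            }
        }
    }
    found
}

fn main() {
    println!("{}", next_multiple_of_seven(10));
    println!("{:?}", find_pair(12));
}
```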

implementing structs and enums

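  • methods: take self (&self, &mut self or self) and operate on the object, called as object.method()
  • associated functions: don’t take self but are related to the type, called as Type::function()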
enum Mood {
    Good,
    Bad,
    Sleepy,
}

impl Mood {
    fn check(&self) {
        match self {
            Mood::Good => println!("Feeling good!"),
            Mood::Bad => println!("Eh, not feeling so good"),
            Mood::Sleepy => println!("Need sleep NOW"),
        }
    }
}

fn main() {
    let my_mood = Mood::Sleepy;
    my_mood.check();
}

attributes

  • added to structs and enums #[derive(Debug)]

dereferencing

  • when using the dot operator . you don’t have to worry about dereferencing a reference:
    • let reference_item = &item
    • let double_reference_item = &reference_item
    • double_reference_item.compare_number(8) works

generics

  • Used if there are multiple types able to be passed to a function
    • fn print_value<T>(value: T) { ... }
  • Generic is usually called T
  • to do something with a generic value you must specify which trait it implements, for instance Debug to print it with {:?}:
    • fn print_value<T: Debug>(value: T) { println!("{:?}", value) }
  • you can require that a generic implements multiple traits by using +
fn some_func<T: SomeTrait + OtherTrait>(item: T) -> bool {
	item.some_function() && item.other_function()
}
  • if you want to compare generics, you need PartialOrd

option

  • use option as return type when you deal with values that might exist, or might not exist
  • use None and Some in code where you need to return an option
fn take_fifth(value: Vec<i32>) -> Option<i32> {
	// fewer than five items: there is no fifth value to return
	if value.len() < 5 {
		None
	} else {
		Some(value[4])
	}
}
  • the None or Some values can be “unwrapped”. You only want to unwrap if you are sure there is a value inside
  • we can use a match on an option to safely deal with the None value
  • we can use is_some() on an option to see if it contains a Some value
  • we can use unwrap_or() to provide a fallback value if the value was none
  • if we match an optional struct, we can use the ref keyword Some(ref value) to get a reference to the value instead of moving it out of the option
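The ways of handling an Option listed above, as a minimal sketch. Unlike the snippet earlier in this section, this take_fifth borrows a slice and uses .get(), which already returns an Option; describe is a made-up helper.

```rust
/// Returns the fifth item if there is one.
fn take_fifth(values: &[i32]) -> Option<i32> {
    values.get(4).copied()
}

/// Handles both cases safely with match instead of unwrapping.
fn describe(values: &[i32]) -> String {
    match take_fifth(values) {
        Some(number) => format!("fifth is {number}"),
        None => "too short".to_string(),
    }
}

fn main() {
    let long = [1, 2, 3, 4, 5];
    let short = [1];
    println!("{}", take_fifth(&long).is_some());
    // unwrap_or gives a fallback instead of panicking on None
    println!("{}", take_fifth(&short).unwrap_or(0));
    println!("{}", describe(&short));
}
```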

result and error handling

  • Result is similar to Option, but holds Ok(value) or Err(error) instead of Some(value) or None
  • take extra care defining the return type of a function that returns a Result.
    • A common mistake is to write one of the “Error” types as the whole return type; the return type is Result<T, E>, where T is the type wrapped in Ok and E is the type wrapped in Err
fn generate_error() -> Result<String, String> {
	if 1 == 1 {
		Ok("it was 1".to_string())
	} else {
		Err("wasnt true".to_string())
	}
}
  • you can define your own errors, as in this example:
#[derive(Debug)]
struct PositiveNonzeroInteger(u64);

enum CreationError {
    Negative,
    Zero,
}
impl PositiveNonzeroInteger {
    fn new(value: i64) -> Result<Self, CreationError> {
        if value < 0 {
            return Err(CreationError::Negative);
        }
        if value == 0 {
            return Err(CreationError::Zero);
        }
        Ok(Self(value as u64))
    }
}
  • implementing naive error handling:
fn give_result(input: i32) -> Result<(), &'static str> {
	if input % 2 == 0 {
		Ok(())
	} else {
		Err("input couldnt be processed")
	}
}

fn main() {
	if give_result(5).is_ok() {
		println!("it was ok");	
	} else {
		println!("it was an error");	
	}
}
  • if you encounter a result, you can if let the error case and continue to safely unwrap the value:
let stream = TcpStream::connect(args.host);

if let Err(e) = stream {
	eprintln!("Failed to connect to host: {}", e);
	return Err(e);
}

let mut stream = stream.unwrap();
  • or, use match and handle the error in one block
let mut stream = match TcpStream::connect(args.host) {
	Ok(stream) => stream,
	Err(e) => {
		eprintln!("Failed to connect to host: {}", e);
		return Err(e)
	}
};

if let and while let

  • if you only want to do something when the value is not None:
let my_vec = vec![2, 3, 4];
for index in 0..10 {
	if let Some(number) = my_vec.get(index) {
		println!("the number is {}", number)
	}
}
  • you can do something similar with while let, e.g. popping a vector until it returns None: while let Some(information) = city.pop() { ... }

hashmap

  • hashmaps are not ordered
  • create with HashMap::new()
  • populate with hashmap.insert(key, value)
  • hashmap.get() returns an Option
  • inserting with existing key will overwrite value
  • the .entry(key) method returns an Entry for that key, which is either occupied or vacant
  • if we chain .entry(key).or_insert(0), the key with value 0 will be inserted into the hashmap if it didn’t exist yet, and we get a mutable reference to the value either way
  • we can then use that reference to for instance increase a counter: *reference += 1;
  • you can also use or_insert(Vec::new()) to create vectors as values and push items onto them
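The entry/or_insert pattern from the list above, sketched for a counter and for grouping into vectors. The function names count_words and group_by_length are made up for the example.

```rust
use std::collections::HashMap;

/// Counts how often each word occurs.
fn count_words(text: &str) -> HashMap<&str, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // or_insert returns a mutable reference to the (new or existing) value
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

/// Groups words by their length, with a vector as the map value.
fn group_by_length<'a>(words: &[&'a str]) -> HashMap<usize, Vec<&'a str>> {
    let mut groups: HashMap<usize, Vec<&'a str>> = HashMap::new();
    for &word in words {
        groups.entry(word.len()).or_insert(Vec::new()).push(word);
    }
    groups
}

fn main() {
    let counts = count_words("to be or not to be");
    println!("{:?}", counts.get("to")); // get returns an Option
    println!("{:?}", group_by_length(&["hi", "yo", "hey"]));
}
```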

btreemap

  • btreemaps work like hashmaps but are sorted by key
  • iterate with for (key, value)

hashset

  • a hashset is implemented as a hashmap where every value is ()
  • used to keep tabs on whether a key exists or not
  • use hashset.contains(&number) to check if a key exists (hashset.get(&number).is_none() works too)

BTreeSet

  • same as hashset but ordered

binary heap

  • is a collection where the first element is always the biggest value, the rest is in no particular order
  • used for instance to create prioritised lists, for example using a BinaryHeap<(u8, &str)> tuple for tasks
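A sketch of the prioritised task list mentioned above. Tuples compare on their first field first, so the u8 acts as the priority; the function name tasks_by_priority and the tasks are made up.

```rust
use std::collections::BinaryHeap;

/// Pops tasks from highest to lowest priority and returns their names.
fn tasks_by_priority(tasks: Vec<(u8, &str)>) -> Vec<&str> {
    let mut heap = BinaryHeap::from(tasks);
    let mut ordered = Vec::new();
    // pop() always removes the largest remaining item
    while let Some((_, name)) = heap.pop() {
        ordered.push(name);
    }
    ordered
}

fn main() {
    let tasks = vec![(1, "sleep"), (100, "eat"), (50, "code")];
    println!("{:?}", tasks_by_priority(tasks));
}
```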

VecDeque

  • vector made especially for being efficient at pushing and popping elements at both the front and the back
  • use pop_front() to pop items off the front
  • pop_back() and pop_front() return an Option (None when empty), which needs to be handled, e.g. with if let or .unwrap()

the mighty ? operator

a lot of people just questionmark out the errors really quickly, you end up handling an error 10 functions higher, with no indication to where this error happened - theprimagen (https://youtu.be/7ySVWcFHz98?si=33xKgzpYc4YGZAds&t=1039)

  • warning: theprimagen warns against becoming lazy by using the ? operator instead of handling errors
  • adding the question mark operator ? to a Result unwraps the value if it was Ok, and returns the error from the current function if it was an Err
  • the compiler can help us find the Err type if we try to call a non-existent method on a Result:
let failure = "Not a number".parse::<i32>();
failure.rbrbrb(); // ⚠️ Compiler: "What is rbrbrb()???"
error[E0599]: no method named `rbrbrb` found for enum `std::result::Result<i32, std::num::ParseIntError>` in the current scope
  • the ? operator is useful when working with files, as each step of loading or writing a file can produce errors
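A minimal sketch of ? in action, using the ParseIntError type that the compiler message above reveals. The function name sum_str is made up for the example.

```rust
use std::num::ParseIntError;

/// Parses two numbers and adds them; any parse error is returned to the caller.
fn sum_str(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.parse::<i32>()?; // on Err, return from sum_str right here
    let y = b.parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    println!("{:?}", sum_str("2", "3"));
    // The caller still has to decide what to do with the error.
    match sum_str("2", "three") {
        Ok(total) => println!("total: {total}"),
        Err(error) => println!("could not add: {error}"),
    }
}
```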

panic & assert

  • use panic!("this should never happen") when something unrecoverable happens, to make the program end immediately
  • assert! takes an expression and an optional message, allowing you to do runtime checks on things that should never happen:
    • assert!(x <= 3, "x should never be higher than 3")
    • assert_eq!(x, 3)
    • assert_ne!(x, 4)
  • primagen on asserts https://youtu.be/7ySVWcFHz98?si=uxTtx2gD06uDjRIP&t=2114

traits

  • declare some behavior, e.g. “what something can do”
  • it feels like overwriting a global prototype, but a trait's methods are only available where the trait is in scope (imported with use)
  • most standard types implement Debug and Clone; only simple types such as ints, floats, bool and char also implement Copy
  • we can give our own structs these traits by using the attribute #[derive(Debug)]
  • you can implement traits manually with impl
  • define a struct: struct Animal { name: String }
  • define a trait: trait Barks { fn bark(&self); }
  • implement that trait on a struct:
impl Barks for Animal {
	fn bark(&self) {
		println!("{} is barking", self.name)
	}
}
  • another common trait is From, to turn an object into an object of another kind
  • impl From<Vec<City>> for Country { fn from(cities: Vec<City>) -> Self { Self {cities}} }

trait bounds

  • traits can be declared without any methods and used purely as trait bounds
  • these trait bounds can then be used to narrow down generics for a function:
impl Magic for Wizard{}

fn fireball<T: Magic + Debug>(character: &T, opponent: &mut Monster, distance: u32) {
	...
}

asref and where

  • AsRef is a trait that gives you a reference to the object as a different type
  • you can use AsRef<str> as a trait bound to make a function accept more than one type, e.g. both String and &str
  • if you have more than one condition to the generic, you can use the where keyword to split up into multiple lines
fn print_it<T>(input: T)
where
	T: AsRef<str> + Debug + Display,
{...}

chaining

  • we can create functional style method chains, passing the returned object to the next function
  • when starting from a vector, we need to turn it into an iterator first
  • after finishing, we must collect the iterator to turn it back into a collection
let range = 1..=10;
let vec = range.collect::<Vec<u8>>();
println!("numbers 1 to 10: {:?}",vec);

let subvec = vec
	.into_iter()
	.skip(2)
	.take(3)
	.collect::<Vec<u8>>();
println!("{:?}",subvec);
  • collect can build many kinds of collections; it uses a type annotation or the turbofish (::<Vec<u8>>) to determine which type to collect into

iterators

  • three types of iterators:
    • .iter() gives an iterator of immutable references
    • .iter_mut() gives an iterator of mutable references
    • .into_iter() gives an iterator of values (not references)
  • iter() creates an iterator over immutable references, letting you traverse data without taking ownership. In contrast, into_iter() consumes the collection, transferring ownership and making it possible to modify or transform each element
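  • a for loop calls into_iter() on the collection you give it, so for item in vector takes ownership of the values; loop over &vector to borrow instead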
  • borrowing note: after calling .iter() on a value, we can still access the original value, as we got immutable references and values were not moved
  • iterators are lazy, they only work when they are being consumed
    • a map without collect doesn’t fire
  • use enumerate() to get both index and value as a tuple
  • use char_indices() on a string to get the byte index and the char
  • use .windows(len) on a vector or slice to iterate over overlapping slices of length len
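A sketch of enumerate, char_indices and windows on small inputs. The three function names are invented for illustration.

```rust
/// Counts how many neighbouring pairs go up, using overlapping windows of 2.
fn rising_pairs(values: &[i32]) -> usize {
    values.windows(2).filter(|pair| pair[1] > pair[0]).count()
}

/// enumerate() yields (index, value) tuples.
fn indexed(words: &[&str]) -> Vec<String> {
    words
        .iter()
        .enumerate()
        .map(|(index, word)| format!("{index}: {word}"))
        .collect()
}

/// char_indices() yields (byte index, char) tuples.
fn char_positions(text: &str) -> Vec<(usize, char)> {
    text.char_indices().collect()
}

fn main() {
    println!("{}", rising_pairs(&[1, 3, 2, 5]));
    println!("{:?}", indexed(&["a", "b"]));
    // U+00E9 is two bytes long, so the next char starts at byte index 3
    println!("{:?}", char_positions("a\u{e9}b"));
}
```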

functional

  • use map to transform each item; collect the results into a new collection
    • take extra care to not put a ; after the last expression in a map closure, or the closure returns () instead of the value
  • use for_each to do something with every item
  • filter, reduce, etc. are available
  • filter_map first maps, and then automatically filters out all the None values
  • use .ok() on a method that delivers result to easily chain into a functional call
    • filter_map(|input| input.parse::<f32>().ok())

closures

  • define closures with || {}
  • declare new variables in a closure: |x:i32| { }
  • closures can use variables from the surrounding scope: let outside = 5; let add = |i: i32| outside + i;
  • in addition to unwrap_or(value) we can use unwrap_or_else(closure) to compute the fallback in a closure, which only runs on None
  • use |_| if you don't need to use the variable in a closure
  • when you have an Option but need a Result, use .ok_or("error message") to turn Some(value) into Ok(value) and None into Err("error message")
  • when dealing with
  • TODO dive deeper in and_then() ✅ 2024-09-23
    • you can use and_then to chain a closure that runs only if the previous step returned Some (or Ok); a None (or Err) is passed on untouched
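A small sketch of and_then, ok_or and unwrap_or_else together with a capturing closure. The function names half_if_even and parsed_or_error are made up for the example.

```rust
/// Parses a number, then halves it only if it is even.
fn half_if_even(input: &str) -> Option<i32> {
    input
        .parse::<i32>()
        .ok() // Result -> Option
        // and_then only runs the closure when there is a Some value
        .and_then(|n| if n % 2 == 0 { Some(n / 2) } else { None })
}

/// Turns a missing value into an error with ok_or.
fn parsed_or_error(input: &str) -> Result<i32, &'static str> {
    input.parse::<i32>().ok().ok_or("not a number")
}

fn main() {
    let outside = 5;
    // The closure captures `outside` from the surrounding scope.
    let add = |i: i32| outside + i;
    println!("{}", add(2));

    // The fallback closure only runs when the Option is None.
    let fallback = half_if_even("x").unwrap_or_else(|| outside * 2);
    println!("{fallback}");
    println!("{:?}", parsed_or_error("x"));
}
```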

serialisation

  • use serde to serialize enums. For example, turning an enum into command line options:
// needs: use serde::{Serialize, Deserialize};
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "kebab-case")]
enum Command {
    Friendly,
    Fuzzer,
    EIP,
    Exploit
}

command line arguments

  • use clap to parse cli arguments from a struct:
use clap::Parser; // brings Args::parse() into scope

#[derive(Parser, Debug)]
#[command(name = "Exploit-Rust")]
#[command(version, about, long_about = None)]
struct Args {
    command: Command,
    host: String,
    return_address: String
}
fn main() {
    let args = Args::parse();
}

simple splitting in files

  • move the function to a separate file (e.g. interact.rs), add pub to fn
  • use the following lines to import the function
    • mod interact; (the file name without .rs)
    • use crate::interact::interact;

any, all, find, fold

  • use any() and all() in combination with a closure to check vectors for certain values and deal with None values
let some_are_none = vec![Some("yes"), Some("yes"), None];
let result1 = some_are_none
	.iter()
	.all(|x| x.is_some());
  • very useful when dealing with multiple async calls at the same time
  • rev() reverses the iteration, starting from the back of the iter
  • find() returns the first item that matches a closure, as an Option
  • fold() also works with options / none values:
    let some_are_none = vec![Some(1), Some(1), None];
    let folded_total = some_are_none
        .iter()
        .fold(0, |total_so_far, next| total_so_far + next.unwrap_or(0));
    println!("folded total: {}", folded_total);

debugging

  • use dbg! to print debug information
  • use the RUST_BACKTRACE=1 environment variable to enable backtraces

lifetimes

  • you cannot keep references to values that die
  • for strings, this is often fixed by giving the string a static lifetime: line: &'static str
  • lifetimes are declared just like generics. struct City<'a> means the references inside the struct must live at least as long as the struct itself

mutating inside structs

  • instead of making a full struct mutable, we can use Cell to make only specific fields mutable
  • when using RefCell we can also create new (mutable) references to the value inside the cell
    • the borrow checks are only executed at runtime, so the compiler will allow a double mutable borrow, but the program will panic when it happens

mutex

  • if a value is declared as a mutex let my_mutex = Mutex::new(5), we can at some point lock it, and mutate it using the created mutex guard object: let mut mutex_changer = my_mutex.lock().unwrap(); *mutex_changer = 6;
  • locked values are unlocked
    • at the end of the lifetime of a mutex changer reference
    • or when explicitly dropped: std::mem::drop(mutex_changer)
  • you don’t need to use a mutex changer variable, you can change the mutex immediately, just don’t forget to dereference the value *my_mutex.lock().unwrap() = 10
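The three ways of releasing a lock described above, as a minimal single-threaded sketch. The function name mutex_demo is made up for the example.

```rust
use std::sync::Mutex;

/// Locks the same mutex three times, releasing it in a different way each time.
fn mutex_demo() -> i32 {
    let my_mutex = Mutex::new(5);

    {
        let mut mutex_changer = my_mutex.lock().unwrap();
        *mutex_changer += 1;
    } // guard goes out of scope here, so the mutex is unlocked

    let mut mutex_changer = my_mutex.lock().unwrap();
    *mutex_changer += 1;
    drop(mutex_changer); // explicit unlock; locking again without this would deadlock

    // No guard variable: lock, change and unlock in one statement.
    *my_mutex.lock().unwrap() += 10;

    let result = *my_mutex.lock().unwrap();
    result
}

fn main() {
    println!("{}", mutex_demo());
}
```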

rwlock

  • is kind of like a mix of a mutex and a refcell
    • many .read is good
    • one .write is good
    • read and write together is not good
  • first drop the writable reference before trying to read
  • use try_read and try_write to avoid locking up

cow “clone on write”

  • TODO look into COW

rc

  • if we want to use heavy objects in multiple places without cloning the data, we can use Rc to create multiple owners of the same value (Rc::clone only increases the reference count)
  • TODO look further into rc

multithreading

  • use std::thread::spawn, which takes a closure that will run on a separate thread
  • keep the returned join handle and call handle.join() to wait for the thread to finish
  • use move before the closure to move a value into the thread

todo

  • use todo!() macro in unimplemented methods

arc “atomic reference counter”

  • if you are going to mutate a variable from multiple threads, we first create a mutex for the value and protect it inside an Arc:
let my_number = Arc::new(Mutex::new(5));
let my_number1 = Arc::clone(&my_number);
std::thread::spawn(move || {
	// use my_number1 inside the thread
	...
});
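A complete, runnable version of the Arc plus Mutex pattern, sketched as a shared counter. The function name count_in_threads and its parameters are made up for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Lets several threads increment one shared counter and returns the total.
fn count_in_threads(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0));
    let mut handles = Vec::new();

    for _ in 0..threads {
        // Each thread gets its own Arc pointing at the same Mutex.
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                *counter.lock().unwrap() += 1;
            }
        }));
    }

    // Wait for all threads to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_threads(4, 100));
}
```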

multithreading and chunking

  • when dealing with large datasets, use chunking to load part of the data in memory and then use rayon to parallelise the processing
  • use Arc’s and AtomicBools to keep track of state amongst threads
let password_found: Arc<AtomicBool> = Arc::new(AtomicBool::new(false));
let mut current_chunk: Vec<String> = Vec::with_capacity(CHUNK_SIZE);
let mut found_password: Option<String> = None;

  • then use par_iter on the chunk inside a chunk processing function
  • use .load and .store on the AtomicBool inside the Arc with Ordering::Relaxed to check if other threads already completed, and to store the result if something is found
   if found.load(std::sync::atomic::Ordering::Relaxed) {
       return None;
   }
   ...
   if password.is_some() {
       found.store(true, Ordering::Relaxed);
   }
   password.cloned()

defining your own closures

  • TODO look into closures in functions

channels

  • use std::sync::mpsc (multiple producer, single consumer): many threads send to one place
  • the sending half of a channel can be cloned and moved to other threads, a bit like an Arc
  • use a join handle to make the main thread wait for the senders
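A sketch of several threads sending into one channel. The function name collect_from_workers and the values sent are made up; the results are sorted because the arrival order is not guaranteed.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns workers that each send one value and collects everything received.
fn collect_from_workers(workers: u32) -> Vec<u32> {
    let (sender, receiver) = mpsc::channel();
    let mut handles = Vec::new();

    for id in 0..workers {
        let sender = sender.clone(); // one sender per thread
        handles.push(thread::spawn(move || {
            sender.send(id * 10).unwrap();
        }));
    }
    // Drop the original sender so the receiver knows when everyone is done.
    drop(sender);

    for handle in handles {
        handle.join().unwrap();
    }

    // iter() ends once all senders are dropped.
    let mut received: Vec<u32> = receiver.iter().collect();
    received.sort();
    received
}

fn main() {
    println!("{:?}", collect_from_workers(3));
}
```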

attributes

  • typical attributes that are used
    • #[allow(dead_code)]
    • #[allow(unused_variables)]
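    • #[derive(Display)] adds display functionality, but needs a crate such as derive_more; std has no derive for Display, so otherwise implement it by hand
    • #[cfg(test)] only compile in test configuration
    • #![no_std] no standard lib, for devices with small memory/space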

testing

  • define tests inside mod with cfg(test), annotate each test function with #[test]
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn types_returns_char() {
        let result = types();
        assert_eq!(result, '=');
    }
}

box

  • box is a smart pointer, it is similar to a reference.
  • box allows you to put a type on the heap instead of on the stack
  • you can use box if you want to do
    • recursion in structs,
    • dynamic memory allocation
    • dynamic dispatch (trait objects)
    • transferable ownership without incurring clone
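    • return any kind of error from a function as Box<dyn Error>
use std::error::Error;
use std::fmt;

#[derive(PartialEq, Debug)]
enum CreationError {
    Negative,
    Zero,
}

// Error requires Debug (derived above) and Display
impl fmt::Display for CreationError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{:?}", self)
    }
}

impl Error for CreationError {}

fn main() -> Result<(), Box<dyn Error>> {
    let pretend_user_input = "42";
    let x: i64 = pretend_user_input.parse()?;
    // PositiveNonzeroInteger as defined under "result and error handling"
    println!("output={:?}", PositiveNonzeroInteger::new(x)?);
    Ok(())
}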

dyn

  • dyn tells the compiler that the type is a trait object: a value of some type that implements the trait, with methods looked up at runtime (e.g. Box<dyn Error>)

default

  • default is like calling new but without an argument, creating a default value
  • you can create a builder pattern with multiple defaults

deref

  • Deref defines what happens when you use * on a type, and is already implemented for well known types
    • Vec<T> implements Deref to a slice [T]
    • String implements Deref to str
  • you can implement your own Deref to allow “smart” pointers to be dereferenced
  • implement DerefMut to also allow mutating the struct
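A minimal sketch of a custom "smart" pointer with Deref and DerefMut. The wrapper type HoldsANumber and the function deref_demo are made up for illustration.

```rust
use std::ops::{Deref, DerefMut};

struct HoldsANumber(u8);

impl Deref for HoldsANumber {
    type Target = u8;

    fn deref(&self) -> &u8 {
        &self.0
    }
}

impl DerefMut for HoldsANumber {
    fn deref_mut(&mut self) -> &mut u8 {
        &mut self.0
    }
}

/// Returns the number after mutating it, plus the result of a u8 method call.
fn deref_demo() -> (u8, Option<u8>) {
    let mut number = HoldsANumber(20);
    *number += 1; // DerefMut lets us change the inner u8 through *
    // Methods of u8 are available directly thanks to Deref.
    (*number, number.checked_sub(100))
}

fn main() {
    println!("{:?}", deref_demo());
}
```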

crates

  • use mod to create a space for your code
  • helps with structuring and reading your code
  • creating private scopes to encapsulate functionality
  • only code with pub is publicly available, everything is private by default
  • for structs, need pub for each public member to expose

cargo

  • use cargo add <cratename> to install a crate and add the dependency to the toml file
  • when you want to use derive macros, you have to enable the derive feature: cargo add serde --features derive

async/await

  • await is a tool for writing async functions that look like synchronous code
  • in asynchronous programming, operations that cannot complete immediately are suspended to the background, and the thread can continue doing other things. when the operation completes, the task is unsuspended and continues processing
  • calls to .await yield control back to the thread
  • async bodies are lazy: they only run when .await is called on them or when they are handed to a runtime
  • async closures can have values moved to them, just like normal closures
  • tokio crate is most used, smallest alternative is smol

future

  • async code implements a trait called “Future”
  • use let ((), ()) = futures::join!(future_a, future_b); to run multiple futures together and continue when all of them are finished
  • async functions need to be executed by a runtime
    • use runtime: futures::executor::block_on(future) to call an async function

net

  • use .to_socket_addrs() (from the std::net::ToSocketAddrs trait) on a domain:port string to resolve it into an iterator of socket addresses

WIP: a tour of the standard lib (https://dhghomon.github.io/easy_rust/Chapter_60.html)