Category Archives: dada

DADA Week 8 Writeup

Our speaker for this week was Eric Peterson, a Research Manager for McAfee. He said he went to school hoping to become a pilot, but things changed and he ended up working with research in security.

The first thing we did was a “Phishing Quiz” designed by the McAfee team to see how good we were at detecting whether an email was trying to phish us. We had to use clues and our experience to determine this. The quiz is linked at http://phishingquiz.mcafee.com/. I don’t think the link works anymore, but it’s in the video lecture. I think the class got around 80% of the emails correctly marked as spam or not spam.

Terms

There were a ton of new terms that he wanted us to know before moving on. Here are a few of the important ones:

  • Spam/ham
    • Spam is exactly what you think: garbage email sent by spammers to spread their crap. Ham is the opposite, meaning any good email that isn’t spam.
  • Spam trap
    • Email address used to get spammer’s emails to identify who they are.
  • Phishing
    • Trying to get people to enter their real information by sending fake webpages/emails.
  • RBL
    • A Real-time Blackhole List, or RBL, is a list of hosts (usually IP addresses) that are known to send spam and should be blocked.

Tools

We learned about a few tools used to research spam and other related emails. A few that are available on Linux are DIG, WHOIS, GREP, SED, and AWK. DIG is the domain information groper, which queries DNS and prints the records for a specific domain. WHOIS serves a similar purpose, but it gives IP/domain registration information. GREP lets us search for specific strings within files. SED is a stream editor used for modifying text, usually text files. AWK is a programming language used for text processing and extracting data.

He also listed two SQL database systems, PostgreSQL and MySQL. PostgreSQL is an object-relational database that is often described as the most advanced open-source database management system. MySQL is one of the most popular database management systems. I have only used MySQL in my first databases class, so PostgreSQL is new to me. We ended up doing a short demo using PostgreSQL in our VMs.

There is also a tool named Regex Coach, which helps users learn how to create and formulate regular expressions to search and match specific strings. We ended up using this one in the lab. I had made regular expressions in my CS 321 class, Theory of Computation, but I had never had any real-life application for them. It turns out making regular expressions is pretty complicated and needs a lot of careful planning.

The other two tools listed were just websites that hosted information that researchers used. They were trustedsource.org and spamhaus.org.

Demos

We first accessed the PostgreSQL server on the Linux VMs. We didn’t do much other than that. The video cut off, and by the time it came back we were already moving on to the next section.

The next part was to use Regex Coach to create regular expressions for the strings

“v | a g r a”  “\/iaGra” “V|4agra” but not “Viagra”

I used an online Regex learning tool that functioned just as well as Regex Coach.

w8-1

RegExr, an online regex learning tool.

The way Eric showed to do it in class was to work through one letter at a time so you could catch more strange variations with a shorter regular expression. The way I did it was to just catch the individual strings.
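Here is a minimal sketch of the letter-at-a-time idea in Python. It is my own guess at what such a pattern could look like, not the regex from class, and the look-alike characters in each set are assumptions I picked so the three sample strings match.

```python
import re

# Optional whitespace allowed between "letters".
SEP = r"\s*"

# One set of alternatives per letter of the word. The look-alike characters
# are illustrative assumptions, not a list taken from the lecture.
LETTERS = [
    r"(?:v|\\/)",  # v, or the two-character look-alike \/
    r"[i1!|l]",    # i and stand-ins such as | or 1
    r"[a4@]+",     # a, 4 or @ (one or more, so "4a" counts as one letter)
    r"g",
    r"r",
    r"[a4@]+",
]

OBFUSCATED = re.compile(SEP.join(LETTERS), re.IGNORECASE)


def is_obfuscated(text: str) -> bool:
    """Return True for disguised spellings and False for the plain word."""
    if text.lower() == "viagra":
        return False
    return OBFUSCATED.fullmatch(text.strip()) is not None
```

Because every letter has its own set, adding one new stand-in character covers every combination it could appear in, which is why this ends up shorter than listing each disguised string by hand.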

In the next lecture we started learning about emails. The first thing we looked at was the email headers and SMTP conversation for a ham email. The main difference between a ham and a spam SMTP conversation is that the spam gets caught and blocked, so the server replies with a 500-level code, which indicates a failure in sending. 200-level codes mean OK. We also learned to read email headers from bottom to top.
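To make the reply codes and the bottom-to-top reading order concrete, here is a small Python sketch. The sample message and its host names are made up for illustration, and the code only uses the standard library email parser.

```python
from email import message_from_string


def smtp_reply_class(code: int) -> str:
    """Classify an SMTP reply code by its first digit."""
    first = code // 100
    if first == 2:
        return "success"
    if first == 3:
        return "intermediate"  # server is waiting for more input
    if first == 4:
        return "temporary failure"
    if first == 5:
        return "permanent failure"
    return "unknown"


def hops_in_travel_order(raw_message: str) -> list:
    """Return the Received headers from first hop to last hop.

    Each mail server adds its Received header at the top, so the oldest
    hop is at the bottom. Reversing the list gives the travel order.
    """
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    return list(reversed(received))


# Hypothetical message with two hops (example.* hosts are placeholders).
SAMPLE = (
    "Received: from mx.example.net by inbox.example.org; Tue, 7 Mar 2017 10:00:02 -0800\n"
    "Received: from sender.example.com by mx.example.net; Tue, 7 Mar 2017 10:00:01 -0800\n"
    "Subject: Newsletter\n"
    "\n"
    "Hello\n"
)
```

Calling hops_in_travel_order(SAMPLE) puts the sender.example.com hop first, which is the same thing as reading the raw headers from the bottom up.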

The next section was learning to identify what spam or ham is based off observations.

w8-2

The spam.

w8-3

The ham.

The things that stood out to me were the php extension in the link, the European name and domain, and the fact that it sounded like clickbait and was not very personal. The class identified that Oprah is used a lot to get her fans to click, that it was HTML based, and that there were no periods. They did make more observations, but these were the main ones.

The ham email stood out to me as clean because there were no links/URLs, it didn’t have a random subject (it’s a newsletter), and it was formatted, so you know it would take time to make. The class pointed out that it was “hippie targeted” and there were no greetings/salutations. Then we made a shape that would represent the email’s features. The more similar an email is to that shape, the more likely it will be the same type, spam or ham.

Lab

Our lab this week is extremely similar to the classification we did for the malicious URLs, but this time it is using PostgreSQL. This time, he did not tell us the percentage of spam vs. ham in the 100,000 rows of real-world message meta data.

My method was to group by the subject line and mark any subject seen over 1000 times as spam. I viewed the highest counted subjects and found that most of the subjects over 1000 or so were definitely spam related (they were mostly related to increasing stock value). The most common one was a piece of spam that was counted 61k times. That made up the majority of my 70k or so marked rows. I also wanted to do similar queries with source IPs and other similar fields, but my SQL skills were lacking.

w8-5

First I had to clear all the original marks.

w8-4

This was my main query.
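For anyone who can’t read the screenshots, this is roughly the idea as a runnable sketch. It uses Python’s built-in sqlite3 instead of PostgreSQL so it runs anywhere, and the table name messages and the columns subject and is_spam are names I made up for the example, not the lab’s real schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory table standing in for the lab data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (subject TEXT, is_spam INTEGER)")
    rows = (
        [("Stock alert", 1)] * 5
        + [("Newsletter", 1)]
        + [("Lunch?", 1)] * 2
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?)", rows)
    return conn


def clear_marks(conn: sqlite3.Connection) -> None:
    """Reset every row to 'not spam' before classifying."""
    conn.execute("UPDATE messages SET is_spam = 0")


def mark_frequent_subjects(conn: sqlite3.Connection, threshold: int) -> int:
    """Mark rows whose subject appears more than `threshold` times.

    Returns the number of rows that end up marked as spam.
    """
    conn.execute(
        "UPDATE messages SET is_spam = 1 WHERE subject IN ("
        " SELECT subject FROM messages"
        " GROUP BY subject HAVING COUNT(*) > ?)",
        (threshold,),
    )
    return conn.execute(
        "SELECT COUNT(*) FROM messages WHERE is_spam = 1"
    ).fetchone()[0]
```

The same GROUP BY ... HAVING COUNT(*) shape should carry over to PostgreSQL, and swapping subject for a source IP column would give the other queries I wanted to try.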

DADA Week 7 Writeup

This week’s speaker was Cedric, and our subject was Web Security, which is a subset of Network Security from last week but more focused on web based technologies such as browsing, URL’s and webpages.

An interesting fact we learned was that 95% of malware is delivered via the web. This doesn’t really surprise me, because where else would you get malware from aside from someone sticking a USB drive into your computer or some other inconvenient form of physical input? No one would bother making malware if they had to leave their basements and walk around trying to physically inject code.

Users are the weakest link in the security of a system. Not even properly functioning operating systems or browsers can protect someone who can’t protect themselves. Most users of the internet are described as “impatient, lazy, self-proclaimed omniscient”, or they just refuse to stop clicking on whatever pops up in their faces. That makes sense, since using a computer is less than a hobby for most people, and most of them have never personally dealt with malware.

Forms of Attack

Phishing is when fake links to popular webpages are sent to unsuspecting people, who then enter valuable information such as their username and password, banking info, etc.

SEO (search) poisoning is when hackers identify popular trends on search engines using Google Trends or something similar and get their malicious links onto those searches to increase the likelihood of getting clicks. Apparently in 2014 Jimmy Kimmel was the top malicious related search. That is weird to me because he lacks any semblance of personality and is a boring host, but he panders to the biggest crowd so his popularity is high.

Fake updates and fake antivirus mimic the actions of real/proper antivirus software but don’t actually do anything. They will pop up in your face relentlessly once installed, telling you that your system has been infected with digital ebola and for the small price of $99 you can clean it off your system for good. Pretty ironic.

URL obfuscation is making malicious links/URLs look legitimate. Here’s the link for Microsoft: http://rnicrosoft.com/.
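A rough way to catch this kind of look-alike in Python is sketched below. The substitution table is just a few examples I picked (it is nowhere near complete), and the trusted list is a stand-in for whatever domains you actually care about.

```python
# Character sequences that can pass for another letter at a glance.
# This table is a small illustrative assumption, not a complete list.
LOOKALIKES = {"rn": "m", "vv": "w", "0": "o", "1": "l"}


def normalize(domain: str) -> str:
    """Collapse look-alike sequences so disguised names compare equal."""
    result = domain.lower()
    for fake, real in LOOKALIKES.items():
        result = result.replace(fake, real)
    return result


def impersonates(domain: str, trusted: list):
    """Return the trusted domain that `domain` imitates, or None."""
    candidate = domain.lower()
    for real in trusted:
        if candidate != real and normalize(candidate) == normalize(real):
            return real
    return None
```

With this, impersonates("rnicrosoft.com", ["microsoft.com"]) points back at microsoft.com, while the real domain itself returns None.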

Malvertising is when hackers place malware-laden ads on legitimate sites so people will be redirected from a good site onto a bad one.

Waterhole attacks are when hackers find sites commonly visited by prime targets (people with no antivirus or a lack of computer knowledge) and inject exploits onto those sites so they can get easy access to their targets’ systems. The hacker can then identify more “waterhole” sites to inject, and start again from there.

So many attacks, but is there any hope for defending? There are methods such as URL/domain reputation systems, which provide real-time protection in browser or network device, and site certification services to identify legitimate webpages. The best and final defense will always be user education.

Lab 1:

The first lab had us look at an application named WebGoat, which is a “poorly-secured web application, using three common attack types”. It was built from the ground up by the OWASP foundation(Open Web Application Security Project). The “goat” is basically an easy target for malicious attacks, and we host one of those on our virtual machines so we can try some of our own attacks against a dummy website.

3723eaa0-f5bc-0132-44e3-0a2ca390b447

Our prey.

The tools/weapons of choice are Tamper Data, BurpSuite and WebScarab. Tamper Data was installed as an add-on onto Firefox. I couldn’t find the other two, but I could do what I needed to do with just Tamper. Tamper Data is a tool that allows us to modify and view HTTP/HTTPS headers and post parameters.

screen-shot-2017-03-03-at-3-03-41-pm

The files provided, Linux edition.

To start the goat up, we used the command:

sudo sh ./webgoat.sh

And then:

sudo sh ./webgoat.sh start8080

This would allow us to access the WebGoat webpages. The pages were located on the URL:

http://127.0.0.1:8080/WebGoat/attack

No internet was required, we did this with a locally hosted page.

screen-shot-2017-03-03-at-1-07-55-pm

The initial goat page.

The first attack we performed was Stored XSS, or cross-site scripting. This means injecting client-side scripts into other people’s browsers, which is a form of code injection. The scenario was to log into the goat’s login page as the employee Tom, and change something which would cause Jerry (from human resources) to see a popup.

screen-shot-2017-03-03-at-5-25-28-pm

The login page.

To do this we had to put some simple javascript into one of the fields. I changed Tom’s old street to:

alert("hi");

screen-shot-2017-03-03-at-5-28-33-pm

The hax.

This just makes a popup in the browser Jerry is using, saying “hi”. So now you log into Jerry’s account and view Tom’s profile. You will see this:

screen-shot-2017-03-03-at-5-31-03-pm

Get hacked son.

Mission accomplished.

The next form of attack: Improper Error Handling(Fail Open). This is exploiting a lack of error handling cases to access things you shouldn’t. In this instance, we had to delete one of the login parameters, either login or password, which would allow us to login.

Enabling parameters allows us to see what sort of data is being passed. Logging in would show two more parameters, “username= ” and “password= ”

screen-shot-2017-03-03-at-2-44-21-pm

The second page with “Show Params” enabled.

A hint said to force a parameter to point to a null pointer. The proper way to do this was to use WebScarab and delete username or password. Since I couldn’t find either of those, I just created a null myself.

screen-shot-2017-03-03-at-2-53-08-pm

Password is null.

Which allowed me to force a login.

screen-shot-2017-03-03-at-2-51-16-pm

The second attack complete.
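To show what “fail open” means in code, here is a small Python sketch. This is not WebGoat’s actual source, just a made-up login check with a made-up user table that has the same kind of bug, next to a version that fails closed.

```python
# Hypothetical credential store, used only for this illustration.
USERS = {"tom": "hunter2"}


def login_fail_open(params: dict) -> bool:
    """Buggy check: a missing parameter raises KeyError, which is swallowed,
    and execution falls through to the final 'return True'."""
    try:
        if params["password"] != USERS[params["username"]]:
            return False
    except KeyError:
        pass  # BUG: the error is ignored instead of denying access
    return True


def login_fail_closed(params: dict) -> bool:
    """Safer check: anything missing or unknown means access is denied."""
    username = params.get("username")
    password = params.get("password")
    if username is None or password is None:
        return False
    expected = USERS.get(username)
    return expected is not None and password == expected
```

Deleting the password parameter, like I did with Tamper Data, makes the first function return True, while the second one returns False for the same request.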

The last attack is Numeric SQL Injection. SQL is the querying language used to manage data. This means sticking some SQL code where it doesn’t belong. Using Tamper Data, I could modify the post parameters being sent.

We were supposed to inject SQL so we could display all the weather data from every station in one table rather than just a single station at a time.

SELECT * FROM weather_data WHERE station = 101

to

SELECT * FROM weather_data WHERE station = 101 or 1=1

Which returned the entire table at once, rather than one station at a time.
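Here is a small sketch of why that works and how a parameterized query stops it. It uses Python’s sqlite3 with a tiny made-up weather_data table (the station names are placeholders), not the real WebGoat database.

```python
import sqlite3


def build_weather_db() -> sqlite3.Connection:
    """Tiny stand-in for the lesson's weather_data table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE weather_data (station INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO weather_data VALUES (?, ?)",
        [(101, "Station A"), (102, "Station B"), (103, "Station C")],
    )
    return conn


def lookup_vulnerable(conn: sqlite3.Connection, station: str) -> list:
    """Builds the query by string concatenation, so input becomes SQL."""
    query = "SELECT * FROM weather_data WHERE station = " + station
    return conn.execute(query).fetchall()


def lookup_parameterized(conn: sqlite3.Connection, station) -> list:
    """Passes the input as a bound value, so it is never parsed as SQL."""
    return conn.execute(
        "SELECT * FROM weather_data WHERE station = ?", (station,)
    ).fetchall()
```

Passing "101 or 1=1" to lookup_vulnerable returns all three rows, while lookup_parameterized treats the same text as one odd value that matches no station.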

We were also asked to do the same thing but through modifying the GET parameters instead. The result is displayed below.

lab1-2

Homework

We had to create a web form vulnerable to reflected cross site scripting then fix it. The other option was to do the same thing but for our professor’s old site. It was vulnerable, but there was no python code in sight, so I went with option 1.

hw-ref

The html page vulnerable to reflected XSS.

hw-ref3

My webpage. Uses GET to echo out a name and email.

hw-ref2

The fixed page. No longer creates an alert box.
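The fix boils down to escaping user input before echoing it back. A minimal sketch in Python, assuming a hypothetical page that greets the user by name (this is not my actual homework code):

```python
from html import escape


def greet_vulnerable(name: str) -> str:
    """Echoes the input straight into the page, so markup in it runs."""
    return "<p>Hello, " + name + "</p>"


def greet_fixed(name: str) -> str:
    """Escapes <, >, & and quotes so the input is shown as plain text."""
    return "<p>Hello, " + escape(name) + "</p>"
```

With a name like a script tag that calls alert, the first version returns a page containing a live script element, and the second returns the same text with the angle brackets turned into &lt; and &gt;, so no alert box appears.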

hw1

Option 2. Hack the professor’s old site. It was vulnerable to reflected XSS attacks but there was no form to modify.

DADA Week 6 Writeup

This week was about Network Security. We had 2 labs and some homework. Our speakers were Ram Venugopalan and Geoffrey Cooper.

Network Security is an incredibly important part of the defense against dark arts. The internet is becoming a more and more important and populated place every day, with millions of users connecting to it daily, uploading and downloading. We need this form of security to keep our data safe from external sources (that don’t need to see it), and keep the baddies away from us. There are tons of network based threats out there.

  • Viruses and worms are downloaded onto computers via the internet most of the time. They are programs that contain harmful code that will try to hurt your system and try to steal your data.
  • Trojan horses are very common pieces of malware that try to act like they do one thing, but actually just want access to your stuff. I very often see fake virus scanners that are actually viruses themselves.
  • Botnets are networks of infected computers that act without their owners’ consent to do bad things on the internet. The infected machines are also called “zombie” computers.

We learned about a few ways that network traffic can be exploited or messed with. The one subject that stood out to me the most was the Denial of Service attack (DoS). This is when an attacker overloads or destabilizes a network or service so that legitimate users lose access to it. The most popular version of this is the Distributed Denial of Service attack (DDoS), where a bunch of computers/systems spam one network with packets/traffic to the point where the network will lag or straight up go down due to stress. I knew about this before this class because a lot of video game servers become targets of DDoS attacks, since they’re pretty simple to do while being hard to defend against. Usually it’s done to crash a multiplayer game or for trolling/malicious reasons. An unintentional DDoS can happen when tons of people try to visit a webpage at once, which is what can happen when a popular thread takes off on Reddit and millions of Redditors try to view/access a linked webpage in the thread.

Ways of defending against this include firewall proxies, checking for spoofed addresses, and traffic scrubbing centers, which prevent illegitimate traffic from pinging/accessing the target network.

Lab 2:

Geoffrey designed this lab as a way for us to get acquainted with sorting through network traffic and analyzing what we could get without actually looking at packets. We were provided a virtual machine running some distro of Linux along with some Python and Perl scripts. Two CSV (comma separated value) files were provided with tons of packet info like source and destination IP, ports, etc. One file was R.csv and the other was O.csv. Apparently, looking at patterns in the occurrences of each IP and port would give us a look into where these packets came from and what they were for. While I understand what he was trying to get at, this ended up being more of a Python learning experience than network security.

For R, the most common ports were 139 for TCP and 137 for UDP. O had 25 for TCP and 5001 for UDP. The services file in the /etc/ directory specifies what each specific port number is used for. R’s TCP and UDP ports were most likely related to a network service, NETBIOS, which is used by computers to communicate over the local area network (LAN). In O, port 25 is listed as SMTP (mail delivery) rather than SSH, and port 5001 was probably from some sort of Yahoo messenger vulnerability exploit.

The next section wanted us to “investigate IP addresses”. R’s most used ipsrc was 10.5.63.230, which was counted 43,338 times, and its most used ipdst was 234.142.142.142, seen 42,981 times. The network prefix that seems to be seen the most is class A. In O, 192.245.12.221 was seen 169,643 times for the ipsrc, and 192.245.12.221 was also the top ipdst with a count of 118,662. O’s dominant class is class C.
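The counting itself is not much code. Here is a minimal Python sketch of the idea, not the scripts we were given. The ipsrc and ipdst column names come from the lab files, but proto and dport are names I am assuming, and the three sample rows are made up.

```python
import csv
import io
from collections import Counter

# Made-up rows in the rough shape of the lab CSVs.
# "proto" and "dport" are assumed column names.
SAMPLE_CSV = (
    "ipsrc,ipdst,proto,dport\n"
    "10.5.63.230,234.142.142.142,17,137\n"
    "10.5.63.230,234.142.142.142,17,137\n"
    "10.5.63.231,10.5.63.230,6,139\n"
)


def top_values(csv_text: str, column: str, n: int = 1) -> list:
    """Return the n most common values in one column with their counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column] for row in reader)
    return counts.most_common(n)


def address_class(ip: str) -> str:
    """Classful network class, decided by the first octet."""
    first = int(ip.split(".")[0])
    if first < 128:
        return "A"
    if first < 192:
        return "B"
    if first < 224:
        return "C"
    if first < 240:
        return "D"  # multicast range
    return "E"
```

For a real file, reading it with open() and passing the text in should give the same kind of counts as above, and address_class confirms that 10.5.63.230 is class A and 192.245.12.221 is class C.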

The next sections wanted us to identify which IP addresses used the IP protocols generic routing encapsulation (GRE), internet protocol security (IPSEC) and the open shortest path first routing protocol (OSPF). To do this, we had to sort through the output from the previous code and find the protocol number corresponding to each protocol, which are 47, 51 and 89 respectively. R had no occurrences, and O had a few: GRE had 2626, IPSEC had 1484, and OSPF had 24. This info confirms that the network is most likely some sort of casual home network due to the small amount of IPs.

We used Wireshark, a network packet reader to examine some packets between a guy and a woman who were apparently exchanging sensitive info. We were tasked to find their meeting day, and where it would take place.

lab1-q1

Looking at their IRC chat, we can see that they wanted to meet Wednesday.

Our hint was to find a Truecrypt file and uncover its contents. A TCP conversation contained an 81200 byte file. I changed the file type to .tc so we could run the Truecrypt program on it, but it had a password. Another message from the woman contained the password. It is shown below.

lab1-q2

Opening the .tc file showed two files, an image and a .txt. I was totally surprised it actually worked.

lab1-q2-3

Apparently they were planning to meet in Vegas. No specific location in Vegas was mentioned, but this was probably all we needed to find.

lab1-q2-2

Lastly, someone clicked a malicious link, and the packets were captured. We were to find the source. It ended up being a piece of Javascript that originated from some malicious website named paimai or something like that.

lab1-q3

Homeworks: Robustness and Web Policies

We were asked to highlight a write-up of the robustness principle for network software, written 35 years ago. Red was for what we disagreed with and green for what still holds today.

robusthw

We were asked to “create the policy for the zone diagram” shown below. Was pretty confusing since we didn’t go over it much in class and I had to just google every possibility.

6-hw2

6-hw

Week 5 DADA Writeup

The speaker for this week was Aditya. The first things we went over this week were what rootkits were, and how we needed to better understand how processes and memory worked so we could better understand how to deal with rootkits. Dealing with rootkits helps us better understand Windows security, which is the biggest target of malware nowadays.

Something that I noticed was that almost all operating systems now are 64 bit, rather than the old 32 bit. I know back when I was in high school and before that, all of my computers were exclusively 32 bit. This change is good for users because 64 bit kernel is harder for rootkits to infiltrate, but it can be done. Methods include bypassing driver signing checks, modifying the windows boot path(preventable with the use of secure boot), kernel exploits such as third party drivers, and stealing valid signatures.

So how do kernels use memory? Kernels use a flat memory model, which means that there are no security features present and the CPU can access any of the memory. The Windows kernel (system32) is made up of mostly Windows kernel and driver code.

Agony Lab

We had to use Cuckoo, but we were also told we could use other tools if we weren’t great with it, such as Regshot.exe.

It created 3 files in cuckoo named 630669781, 7449788859, and 804340447. Inside each was one file, in the same order: bad.bin, tzres.dll.bin, and sortdefault.nls.bin.

We then used the command line to search around for different files. One file that we found using the “dir *.sys” command was wininit.sys. This file is not visible without the use of tools or search commands.

We then used the tool Tuluka to view “suspicious” files. 3 files popped up, highlighted in red: wininit.sys in analyzer. It called the 3 functions, NtEnumerateValueKey, NtQueryDirectoryFile, and NtQuerySystemInformation. The original and current columns refer to the pointers in memory.

Next we used the livekd.exe tool, which reads kernel memory, to see what exactly is happening there. Running the “u <address>” command shows what’s happening at that memory address. We ran this command for both the original and current pointers of the wininit.sys file. We are also able to look at the modules loaded.

One cool thing we did was right click and use the “restore service” function to return the pointer to its original state. This dropped one of the suspicious activities back to normal, and allowed us to see the wininit.sys file in the analyzer directory once again.
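The original/current comparison is simple enough to sketch in Python. This is only a toy model of the idea and not how Tuluka works internally. The three hooked function names are the ones from the lab, but every address below is invented for the example.

```python
# Function name -> (original pointer, current pointer).
# All addresses here are invented for illustration.
def demo_table() -> dict:
    return {
        "NtEnumerateValueKey": (0x82A1B000, 0x9F3C1000),
        "NtQueryDirectoryFile": (0x82A1C400, 0x9F3C1200),
        "NtQuerySystemInformation": (0x82A1D800, 0x9F3C1400),
        "NtOpenFile": (0x82A1E000, 0x82A1E000),  # untouched entry
    }


def find_hooks(table: dict) -> list:
    """Names whose current pointer no longer matches the original."""
    return sorted(
        name for name, (original, current) in table.items()
        if original != current
    )


def restore(table: dict, name: str) -> None:
    """Point one entry back at its original address."""
    original, _current = table[name]
    table[name] = (original, original)
```

Restoring an entry removes it from the find_hooks result, which matches what we saw when one of the suspicious rows dropped back to normal.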

Thread Basics

When an application such as Word or Excel executes some kind of arithmetic code, it gets translated to assembly code, then machine code. This machine code is passed to the RAM first before any execution happens. If another program wants to do something similar, it will be scheduled after the first app. The thread scheduler decides what gets executed first. When the instructions are executed, the CPU takes the code through an instruction pipeline, which returns the result back through the bus to the application. Multi-threading was introduced, where there can be multiple instruction pipelines to increase productivity in code execution, and later came multi-core CPUs, which can multiply productivity (I have a quad-core laptop). Having multiple cores makes multi-tasking better, but doesn’t make processing power much stronger.

Processes have their own memory and boundary. They are implemented as objects, and an executable process may contain more than one thread. Each process has an object table that has handles to other known objects, and each process needs a thread to execute.

Process Hacker

This tool is extremely similar to Process Explorer, but it can also show how each process’ virtual memory changes, in order. It also shows the memory contents at any of these points.

We used a malware file named zbot, which contained a .bin and malware.exe. We looked at the effect of malware.exe on notepad. One notable thing in the memory was that there were now tons of Private (Commit) regions in notepad’s memory, and they had read-write access. This is an example of Process Injection, where an outside program injects code into another process.

WinDbg

We got a chance to use two separate VM’s to analyze the kernel of one VM in another using WinDbg. WinDbg could break and freeze the VM just like it could break a program. The exercise we were asked to do was repatch the modified pointers in memory that were changed by the Agony malware using WinDbg. Patching/changing the pointers back to their original pointers leaves the malware, but the code won’t ever be executed, basically neutering it. Apparently there’s a chance that doing this will kill the VM and “blue screen of death” the computer, but if done right, the hook should be gone.

Bootkit

We learned about a type of malware named a bootkit, which hooks/patches Windows through the master boot record specifically. Apparently some of the first malware was a bootkit. In 2015 there was a bootkit named “South Korean Viper” that messed up a bunch of computers in South Korea, to the extent that they could not start.

This was one of the more confusing weeks for this class. Learning about the kernel and patching bytes was confusing, but it’s definitely interesting. Re-learning about threads and processes reminded me of my time in my operating systems class. I struggled in that class, so I definitely did not have that much fun programming with processes again.

Week 4 DADA Writeup

Our speaker this week, Brad, introduced himself as the type of guy that is more on the Dark Arts side of DAtDA. His job focused on exploiting vulnerabilities and revealing them to clients.

On Windows we used WinDbg, which is basically Windows’ version of GDB. We got to try it hands on with a program that he developed named FSExploitMe. It’s an ActiveX based exploit that works only on Internet Explorer. Starting it up and then attaching WinDbg to the process would allow us to see its effect on the stack/heap. Most of the code looked like Assembly, which was something I learned nearly 2 years ago, so I didn’t remember the registers that well. Fortunately, Brad set up a great demonstration with willing volunteers to help demonstrate how a function manipulates the stack.

Memory Corruption

The main point of vulnerability we focused on was “memory corruption”. That means to access memory in a way that causes an undefined behavior, or unintended behavior. Some examples of this are uninitialized memory, or array index calculations.

Normally to exploit something, an “exploit” needs to be created. Usually this is some code/data that is passed to a program that will cause a condition. In an internet browser, an exploit would be a set of javascript calls. This first step would be called the “vulnerability trigger”. After that trigger happens, the “payload”, or the action that happens as a result of the trigger, will occur. Usually this is “shell code”, or just some kind of assembly code that is run in a command line/shell. In our Windows examples, the shell code just started the built-in Windows calculator. Apparently being able to run an application outside of the original program means you’ve “hacked” the system.

Metasploit

Brad talked a little bit about a tool named Metasploit, which is an open-source piece of software that is used for penetration testing in a network. It’s meant for use by professionals, but because it’s open-source, anyone can use it.

Metasploit contains a database of “quality-assured” exploits for use, which of course means amateur “hackers” can use it to help them break into restricted systems, but fortunately, the payloads in Metasploit aren’t great at covering tracks, so it’s used mainly for learning how to deliver your own payload.

The Stack

The stack is a data structure used to hold information. You can push stuff onto it, and pop stuff off the top.

The register EBP points to the base, or bottom, of the current stack frame, and the register ESP points to the top/end of the stack, where things are pushed on and popped off.

stack-operation-in-c-programming

We saw in class a theoretical exploit where memory overflowed into the other variables in the stack, because the data is usually stored sequentially, so if you overflow from the top of the stack, it will invade the space of the variable below it.
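Python doesn’t have real buffer overflows, but the layout idea can be simulated with a bytearray. In this sketch the “frame” is an 8 byte buffer followed by a 4 byte flag and a 4 byte return address. The layout, sizes and address are all made up for illustration.

```python
import struct

BUF_SIZE = 8  # size of the pretend local buffer


def new_frame() -> bytearray:
    """Pretend stack frame: [buffer: 8][is_admin: 4][return address: 4]."""
    frame = bytearray(BUF_SIZE + 8)
    struct.pack_into("<I", frame, BUF_SIZE, 0)               # is_admin = 0
    struct.pack_into("<I", frame, BUF_SIZE + 4, 0x00401000)  # made-up address
    return frame


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copies without checking the buffer size, like an unsafe C copy."""
    frame[0:len(data)] = data


def checked_copy(frame: bytearray, data: bytes) -> None:
    """Refuses input that does not fit in the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    frame[0:len(data)] = data


def read_fields(frame: bytearray) -> tuple:
    """Return (is_admin, return_address) as stored after the buffer."""
    return struct.unpack_from("<II", frame, BUF_SIZE)
```

Writing 12 bytes of "A" with unchecked_copy changes is_admin to 0x41414141 even though the code never touched that variable directly, and 16 bytes would also replace the return address.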

Code Execution

The first thing to do is to see how much data we control with our exploit, which registers are available, etc. Then we determine the offset of the return address, which is basically what ends up in the eip register, so we can execute our payload. What we did in class for this was to exploit javascript. Next was to put our shell code into the spot where it will get executed, but first you have to find where to put it. A technique to help do this is called “trampolining”, which looks for an instruction called “jump esp”, which shouldn’t be used.

The bytes to represent the “jump esp” instructions are ffe4, so giving the address for the byte ffe4 to the module will force a jump to esp. Fortunately, this method won’t work on modern operating systems because they have been patched.

FSExploit

To actually accomplish the code execution steps described above, we first had to use the “msfPatternString” to find the offset. Then loading byakugan with “!load byakugan” gave us the offset of eip, which ended up being 1028.

Then, we changed the variable back to MakeString(1028/2). We had to divide by two because it would return 2 bytes for every 1 byte requested. Now we could look for “jump esp” so we could put our shellcode in. The address ended up being 55442437, but we had to enter the bytes in reverse order. Then we had to add in 4 extra bytes because there was a “ret 4” that was popping arguments off the stack. Finally, we could add our shell code in, which was already coded for us, which gave us the calculator.
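The offset and byte-order steps can be sketched in Python. This is my own reconstruction of the idea and not the FSExploitMe or Metasploit code: the pattern generator imitates the upper/lower/digit style of pattern strings, I am assuming the address 55442437 is hexadecimal, and the shellcode argument is just a placeholder.

```python
import itertools
import string
import struct


def cyclic_pattern(length: int) -> bytes:
    """Pattern like Aa0Aa1Aa2... where every 4-byte window is unique
    (up to 20280 bytes, after which the triples run out)."""
    triples = itertools.product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    )
    needed = length // 3 + 1
    text = "".join("".join(t) for t in itertools.islice(triples, needed))
    return text[:length].encode("ascii")


def find_offset(pattern: bytes, eip_value: int) -> int:
    """Locate the 4 bytes that ended up in eip (stored little-endian)."""
    return pattern.find(struct.pack("<I", eip_value))


def build_layout(offset: int, jmp_esp_addr: int, shellcode: bytes) -> bytes:
    """Padding, then the return address in reversed (little-endian) byte
    order, then 4 filler bytes for the 'ret 4', then the payload."""
    return (
        b"A" * offset
        + struct.pack("<I", jmp_esp_addr)
        + b"B" * 4
        + shellcode
    )
```

struct.pack("<I", 0x55442437) produces the bytes 37 24 44 55, which is the “reverse order” we had to type in by hand.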

capture

Page Heap

Windows has a feature that lets programs use a special heap that gives some extra debugging info. We enabled it for Internet Explorer.

capture

+hpa : enables page heap

+ust: enables user stack tracing

This creates a flag in the registry that basically says to use the special page heap.

The point of this special heap is to help us create a use-after-free exploit, which is what it sounds like. It frees an object, replaces it with our object, which has the same size and allocations, then we position our shell code and use the object, which would end up executing our code. This lesson was also included with the FSExploit files.

DADA week 3 and lab (Malware Defenses)

Date: 2/4/17

What We Learned

Craig introduced ways that attackers would try to locate and catch victims, such as looking at popular Google searches and targeting those. The first step in successfully “producing” malware is distributing it as far and wide as possible. Social engineering helps with this, through deception and/or exploiting lack of knowledge. Once it’s on the disk/drive, it has to stay there, and this is done by having similar names to standard OS files or having signed binaries. Rootkits and bootkits hide from the user so they can’t find the source of the malware.

Malware Defense

There are various ways to keep malware from getting in, with many layers that go on top of each other. Anti-Malware is the last layer of defense, which is on the disk, with the first layer being the Network Firewall and Network Intrusion Prevention. In my opinion, the first layer is your mind. Learning the warning signs of attempted malware intrusion and avoiding/preventing them from attacking are the cheapest and easiest way to prevent infection.

the-future-of-automated-malware-generation-48-638

Yara

Yara is a pattern matching based malware signature scanner that allows users to create their own malware signatures to identify specific malware on a machine. It’s much simpler than most programming languages but is powerful enough to provide robust searches with simple syntax.

Good yara signatures should capture unusual commonalities between malware groups without targeting normal operating system files. We created signatures then tested them on our virtual machine’s system32 folder.

Automated analysis and signatures make up almost the entirety of signatures used nowadays, but the “best” are often handmade. A few of the newer ways of making signatures are using machine learning to make better rules and looking more at memory for signatures. It is now more important than ever to automate signature creation because of the sheer number of unique malware binaries. We were shown a graph that indicated that there were up to 300 million unique binaries.

Yara Activity

We had to create some YARA rules for the samples that Craig had us analyze.

My first rule was:

rule Sytro {
    // Flag any file that contains the ASCII string "Delphi"
    strings:
        $a = "Delphi"
    condition:
        all of them
}

Second:

rule CVE {
    // Both strings must be present for the rule to match
    strings:
        $b = "DownloaderActiveX"
        $c = "Download"
    condition:
        all of them
}

I just used the yara editor’s inspection functionality to find strings, but for sample 2 I had to use FileInsight because the files consisted of javascript that the editor couldn’t display in a readable fashion. I couldn’t find the CLASSID Craig gave us until I used FileInsight, which showed that it was used in the script that also contained DownloaderActiveX1. In sample group 1, a rule Craig used had the string “Jenna Jam”. Google search for this string gives interesting results, not malware related.

Third:

rule domai {
    // Either string on its own is enough to match
    strings:
        $a = "Tuguu"
        $b = "AVG"
    condition:
        any of them
}

This was much harder to do since the files were so hard to read with FileInsight and the yara editor, on top of there being so many lines of code for each one.
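To show what “all of them” and “any of them” actually check, here is a plain Python imitation of the three rules. It is not the yara library and skips everything Yara really does (wildcards, offsets, modifiers). It only does the case-sensitive substring test that simple text strings boil down to, and the sample bytes are invented.

```python
def rule_matches(data: bytes, strings: list, condition: str) -> bool:
    """Imitate a string-only rule with an 'all' or 'any' condition."""
    hits = [s.encode("ascii") in data for s in strings]
    if condition == "all":
        return all(hits)
    if condition == "any":
        return any(hits)
    raise ValueError("condition must be 'all' or 'any'")


# The string sets come from the three rules in this post.
RULES = {
    "Sytro": (["Delphi"], "all"),
    "CVE": (["DownloaderActiveX", "Download"], "all"),
    "domai": (["Tuguu", "AVG"], "any"),
}


def scan(data: bytes) -> list:
    """Return the names of every rule that matches the given bytes."""
    return sorted(
        name for name, (strings, condition) in RULES.items()
        if rule_matches(data, strings, condition)
    )
```

One thing this makes obvious is that a short string like "AVG" under “any of them” will match a lot of harmless files, so a rule like my third one is easy to trigger by accident.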

Cuckoo Tool

This tool contains a lot of previously seen functionality. It can see files that were created, deleted, and downloaded, analyze memory dumps of processes, and trace network traffic.

Cuckoo Lab

The hash I looked at was 4844fd851088… I analyzed it as if it was malware until I found out that it was not actually malware. I guess I should’ve looked up the hash before writing. After my analysis of it, I will analyze a malicious file.

Delphi was the obvious choice here. There weren’t a lot of strings to look at, so I had to pick what worked.

Analysis

In the lecture, we learned that Delphi is a programming language commonly used to write malware. The string “Delphi” therefore looked like an obvious target for a yara rule, although legitimate Delphi programs contain it too, so a match is a hint rather than proof.

Looking through the CSV, we can see that the first thing that happens is that bad tries to open registry keys under Software\Borland\Locales and Software\Borland\Delphi\Locales. Both of these fail, most likely because those keys don’t exist yet. It also loads a few libraries such as uxtheme.dll, user32.dll, ntdll, and ADVAPI32.dll. uxtheme.dll is “a system wide hook to intercept paint calls and injects skin data”, meaning it allows Windows to apply visual themes/styles to applications. It seems that bad is trying to create a hook, though its purpose is not clear. A function named ThemeInitApiHook is called to “give alternate implementations for functions” that user32.dll uses. user32.dll is an essential Windows DLL that implements standard features of the graphical user interface.

One notable thing that happened was that bad tried to run something named LADS. On the command line, it output copyright info pointing to Frank Heyne Software, at heysoft.de, a German hosted website. Below this info it said “This program lists files with alternate data streams <ADS>”. Alternate data streams are an NTFS feature that lets a file carry additional named streams of data alongside its main contents, for example metadata such as an author or title. While ADS is sometimes exploited for malicious purposes, LADS itself is not actually malware. I only realized this after writing all of this.

Analysis (hash: a1874f714f7a15399b9fae968180b303)

Yara used:

rule lab{
 strings:
  $a="Delphi"
  $b="Borland"
 condition:
  all of them
}

This signature gives a false positive for 4844, which is apparently also written with Delphi.

Running analyzer.py with this hash creates 3 CSVs. The largest one’s process name is cmd.exe. The first notable thing I see is a bunch of failed NtOpenKey attempts for various registry keys under \registry\machine\software\policies\microsoft\windows\safer\codeidentifiers\26144\. Following that are a lot more NtQueryValueKey calls and more opening and closing. It fails to open about half of the keys it tries, and each failure results in an immediate NtClose.
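Scrolling through a big CSV by eye gets old fast, so here is a small Python sketch of how the failed calls could be tallied instead. Big caveat: I am assuming a CSV with columns named `api` and `status` and a status value of `SUCCESS`; the real analyzer output may be laid out differently, so the column names and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def failed_calls(csv_text: str, api_col: str = "api", status_col: str = "status") -> Counter:
    """Count, per API name, the rows whose status is anything other than SUCCESS."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(
        row[api_col]
        for row in reader
        if row[status_col].strip().upper() != "SUCCESS"
    )


# Hypothetical log excerpt; the key names are placeholders, not real paths.
sample_log = "\n".join([
    "api,status,arguments",
    "NtOpenKey,FAILURE,key_a",
    "NtOpenKey,SUCCESS,key_b",
    "NtQueryValueKey,SUCCESS,key_b",
    "NtClose,SUCCESS,key_b",
    "NtOpenKey,FAILURE,key_c",
])

for api, count in failed_calls(sample_log).most_common():
    print(api, count)
```

A tally like this makes patterns such as "half of the NtOpenKey calls fail" visible without reading every row.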

After these registry accesses, it creates a file at C:\Users\Admin\AppData\Local\Temp\Deleteme.bat. This file was set to delete itself, obviously, but a few other files were added in the same directory. They were ntshruis2.dll, prints.exe, and qinput.png.

Qinput.png is a picture of a QQ login prompt. QQ is a popular Chinese messaging app/program. Prints.exe had its own CSV made, so let’s look at that.

Prints.exe loads kernel32.dll, advapi32.dll, oleaut32.dll, user32.dll, and ntshruis2.dll. Using these libraries, it calls a ton of functions such as FindFirstFileA, CreateFileA, and GetCommandLineA. Presumably prints.exe is reading and writing to a file somewhere, but it is not said explicitly in the CSV. At the end of the CSV, I can see that it successfully creates a registry value under the Software\Microsoft\Windows\CurrentVersion\Run key. This new value is named WinSysQQ, with the buffer referencing prints.exe in the Temp directory. Programs listed under this key are started automatically at logon, so this is how the malware keeps itself running. It also tries to open a registry key in Software\Borland\Locales, which it fails to do.
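Writes to the Run key are worth flagging automatically when going through registry activity. Below is a minimal Python sketch of such a check. The function name and the list of keys are my own choices, and the list is deliberately short (Run and RunOnce only), so treat it as an illustration rather than a complete list of autostart locations.

```python
# Registry keys whose values are launched automatically at logon.
# Assumption: only these two well-known keys are checked here.
AUTORUN_SUFFIXES = (
    r"software\microsoft\windows\currentversion\run",
    r"software\microsoft\windows\currentversion\runonce",
)


def is_autorun_key(key_path: str) -> bool:
    """Return True if key_path ends in one of the autostart keys above."""
    normalized = key_path.replace("/", "\\").rstrip("\\").lower()
    return normalized.endswith(AUTORUN_SUFFIXES)


# Paths taken from the analysis above, with a hive prefix added for illustration.
print(is_autorun_key(r"HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run"))
print(is_autorun_key(r"Software\Borland\Locales"))
```

The comparison is case-insensitive because registry key names are not case-sensitive on Windows.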

Google says WinSysQQ is a pop up that appears on Windows to promote ads and other junk. Looking back in the CSV I can now see a function that stands out, MessageBoxA. We know that user32.dll controls the Windows GUI, and this malware is using it to create unwanted pop ups on the system. Winsys is a name that sounds legitimate, but having QQ at the end is just a little extra the programmers added to reference the Chinese app.

The 3rd CSV created shows the bad process creating the 3 new files in Temp. It tries to make them hidden, but I think the VM by default displays all hidden files.

Reverting to my initial state and re-running analyzer with Flypaper.exe up this time allows me to read Deleteme.bat’s code.

It seems that this is a simple batch file that is supposed to remove the source of this malware.

Conclusion

The point of all these exercises was to learn about malware threats and determine if they really are threats, to classify and isolate the malicious code, and to prevent future attacks. Cuckoo allows us to identify what type of threat is happening while allowing us to classify the malware. The Yara signatures and general analysis we did showed how to identify malicious code.

DADA Week 2 Writeup

This week was about digital forensics, the collection of evidence from electronic devices, usually for reasons related to cyber crime.

Major use cases for digital forensics include fraud, intellectual property theft, hacking, and electronic discovery (e-discovery), which is the process where data is sought and secured with the intent of using it in a criminal or civil legal case.

Forensics in a Nutshell

There are 3 broadly classified categories in forensic computing:

  • Live forensics
  • Post-mortem based forensics (analysis after the fact)
  • Network based forensics

There are standards and methods of data collection that must always be followed:

  • Minimize data loss
  • Record everything
  • Analyze all data collected (evidence)
  • Report findings

Evidence is anything that can be used to prove or disprove a claim. In forensics, evidence can be found on networks, in operating systems, databases and apps, and on removable media such as disks and USB drives. Admissible evidence is what courts accept as legitimate.

Preserving the Evidence

When handling evidence, it is crucial to follow these procedures:

  • Create a cryptographic hash of the entire disk and each partition before analyzing. A cryptographic hash is used to verify the integrity of a file. It can be used to tell if a file has been tampered with or changed. Popular hash algorithms include MD5 and SHA1.
  • Create bit-images of hard drives and analyze those instead of the originals. This means creating a bit for bit copy of the hard drive, so the copy is identical to the original.
  • Lock the original disk in a limited access room or container. This keeps the disk safe from outside influence and from anyone tampering with the evidence.
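The hash and image steps above can be sketched in Python with the standard hashlib module. This is only an illustration of the idea, not a forensic tool: real acquisitions use a write blocker and dedicated imaging software, and the function names and file paths here are hypothetical.

```python
import hashlib


def hash_file(path: str, algorithm: str = "sha1", chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so that large disk images do not need to fit in RAM."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_and_verify(source: str, image: str, algorithm: str = "sha1",
                     chunk_size: int = 1024 * 1024) -> bool:
    """Copy source to image byte for byte, then confirm that both hashes match."""
    with open(source, "rb") as src, open(image, "wb") as dst:
        for chunk in iter(lambda: src.read(chunk_size), b""):
            dst.write(chunk)
    return hash_file(source, algorithm) == hash_file(image, algorithm)


# Example usage with hypothetical paths:
#   original_hash = hash_file("evidence.raw", "md5")
#   ok = image_and_verify("evidence.raw", "working_copy.raw")
```

If the hash of the working copy ever stops matching the hash recorded at acquisition time, the copy has changed and any findings from it are in doubt.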

Acquisition

What to acquire when looking for evidence:

  • Memory:
    • Virtual and Physical
  • Drive:
    • Entire physical drive
    • Logical: a partition
  • Network Traffic:
    • Full packet captures

Locard’s Exchange Principle

We learned about this principle this week, which states that when any two objects come in contact, there will be a transfer of material from each object onto the other. The main point is that it is impossible to interact with a system without affecting it in some way.

Locard’s principle is not exclusively a digital forensics principle; it also applies to real life crimes.

Volatility

Data can be volatile, which means it can be easily lost. There are degrees of volatility, and the most volatile data must be acquired first or it will be lost forever. Data that is lost when the machine powers down is an example of the most volatile.

 

Table showing the volatility of common device artifacts.

An example of the acquisition of data on Windows would be to acquire volatile data first, then non-volatile, which includes event logs and registry, if applicable. Lastly, obtain any relevant files, such as unknown executables, and any leftover tools.
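The Windows acquisition order described above can be written down as a simple sort. In this Python sketch the three stages come from the paragraph above, but the specific artifact names under each stage (such as "network connections") are my own examples and were not taken from the lecture's table.

```python
# Lower stage number = acquire sooner.
# Stage 0: volatile data, stage 1: non-volatile system data, stage 2: relevant files.
ACQUISITION_STAGE = {
    "physical memory": 0,
    "running processes": 0,
    "network connections": 0,
    "event logs": 1,
    "registry": 1,
    "unknown executables": 2,
    "leftover tools": 2,
}


def acquisition_order(artifacts: list) -> list:
    """Sort artifacts so the most volatile come first; unknown names go last."""
    last = max(ACQUISITION_STAGE.values()) + 1
    return sorted(artifacts, key=lambda name: ACQUISITION_STAGE.get(name, last))


plan = acquisition_order(
    ["registry", "unknown executables", "physical memory", "event logs"]
)
print(plan)
```

Python's sort is stable, so artifacts in the same stage keep the order they were listed in.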

Physical Memory

RAM, or random access memory, is a computer’s short term memory. Once a computer is powered down, the information in RAM begins to rapidly decay and is lost. There is tons of useful info that can be found in RAM, such as data that cannot be obtained from a hard drive, as well as any leftover artifacts ‘hidden’ by hackers. Some examples of such data are the processes running at the time of the memory snapshot, device drivers, loaded modules and DLLs, keystrokes, and wireless keys, among other things.

How does physical memory work? Memory is divided into fixed-size “pages”, and space in physical memory is allocated page by page. The same page of memory can appear in different locations, and pages can be moved from physical memory into a page file to make more space. A page file is a file on disk that stores data that RAM can’t hold.
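Here is a toy Python sketch of what "page by page" means in practice. I am assuming 4 KiB pages, which is a common size but was not stated in the lecture, and the page table contents are made up. An address is split into a page number and an offset within that page, and the page table says whether the page currently lives in RAM or in the page file.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def split_address(address: int) -> tuple:
    """Split an address into (page number, offset inside that page)."""
    return divmod(address, PAGE_SIZE)


def locate(address: int, page_table: dict) -> tuple:
    """Return (where, byte position) for an address.

    page_table maps a page number to (where, slot), with where being
    "ram" or "pagefile" and slot being the page-sized slot in that store.
    """
    page, offset = split_address(address)
    where, slot = page_table[page]
    return where, slot * PAGE_SIZE + offset


# Hypothetical page table: page 0 is in RAM, page 1 was moved to the page file.
table = {0: ("ram", 7), 1: ("pagefile", 2)}

print(locate(0x0010, table))
print(locate(0x1234, table))
```

This is also why a memory dump alone can be incomplete: pages that were moved to the page file are on disk, not in RAM.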

Tools

We were introduced to some tools for analyzing data. One of them was Volatility, a memory forensics framework that comes with a lot of free and useful tools and can also be used to write your own plugins.

Yara is a tool that can create signatures for malicious behavior, which can then be used to scan for that malicious behavior.

FTK Imager

Christiaan showed us a tool named FTK Imager (Forensic Toolkit). He made a point to never install forensic tools on suspect machines, calling it “the worse case thing you can do” because it can alter the evidence, which is Locard’s principle in action.

One of the main functions of this program is to capture the memory of the computer of interest. It can create an image of a disk or USB stick, or capture RAM.

Christiaan says he prefers using the command line to do that same thing, but FTK is free and has a good GUI for learning. A drawback is that it leaves a large fingerprint in the memory.

Volatility

Command line based forensic tool. A cheatsheet with various commands was provided:

http://downloads.volatilityfoundation.org/releases/2.4/CheatSheet_v2.4.pdf

Ran with command:

 volatility.exe -f <name of memory dump> <plugin>

The “imageinfo” plugin can be used if you want to know more about the memory image you are analyzing, such as which profile to use. He also used “psscan” to find hidden or terminated processes. “dlllist” is used to show the DLLs loaded by a process.
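When running several plugins against the same dump, it helps to build the command lines in a script. The Python sketch below only builds the argument list and does not run anything. It assumes the Volatility 2.x style of command line (`-f` for the dump and `--profile=` for the profile); the helper name, the dump file name, and the profile used in the example are placeholders of mine.

```python
def volatility_command(dump_path: str, plugin: str, profile: str = None,
                       exe: str = "volatility.exe") -> list:
    """Build a Volatility 2.x style argument list: exe -f DUMP [--profile=P] PLUGIN."""
    command = [exe, "-f", dump_path]
    if profile:
        command.append("--profile=" + profile)
    command.append(plugin)
    return command


# Hypothetical dump file and profile name, for illustration only.
for plugin in ("imageinfo", "psscan", "dlllist"):
    print(" ".join(volatility_command("memdump.raw", plugin, profile="Win7SP1x64")))
```

The resulting list can be handed to `subprocess.run` on a machine where Volatility is installed. Running "imageinfo" first without a profile is the usual way to find out which profile fits the dump.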

Photorec

A file recovery tool, best known for recovering images. Christiaan walked us through the process of mounting a virtual disk and recovering photos from it. Recovering files straight from the raw data like this is called “carving”.
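To get a feel for what carving does, here is a very naive Python sketch that pulls JPEG-looking byte ranges out of a blob of raw data by searching for the JPEG start marker (FF D8 FF) and end marker (FF D9). This is my own simplification and not how Photorec actually works; real carvers handle fragmentation, embedded thumbnails, and many more file types.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
JPEG_END = b"\xff\xd9"        # end-of-image marker


def carve_jpegs(data: bytes) -> list:
    """Return every byte range that starts with JPEG_START and ends with the next JPEG_END."""
    found = []
    position = 0
    while True:
        start = data.find(JPEG_START, position)
        if start == -1:
            break
        end = data.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        end += len(JPEG_END)
        found.append(data[start:end])
        position = end
    return found


# Fake "disk" contents with two tiny JPEG-like blobs between junk bytes.
raw = b"junk" + b"\xff\xd8\xffAAAA\xff\xd9" + b"more" + b"\xff\xd8\xffBB\xff\xd9"
print(len(carve_jpegs(raw)))
```

The key point is that carving never consults the file system, so it can still find files whose directory entries were deleted.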

Conclusion

This week was about the collection of data and evidence in a safe manner that preserves the integrity of the original data. It is important to present the facts in an unbiased manner, so we learned how to collect without leaving traces of tampering.