SPO600 Project - Stage 2: Function Clone Detection and Analysis (Part 2)
Table of Contents Reproducing My Setup Detailed Test Results x86_64 Results aarch64 Challenges Capabilities and Limitations What My Implementation Can Do Technical Limitations Knowledge Gaps and Personal Reflections Technical Improvements for Stage III Conclusion Reproducing My Setup To replicate my work, follow these steps: First, create or modify pass file located at gcc/gcc/tree-amullagaliev.cc with implementation shown in Part 1. Navigate to your GCC build directory: cd ~/gcc-build-001/ Rebuild GCC with your modified pass: time make -j$(nproc) Test your pass using provided test cases: cd /path/to/test/directory tar -xzf /public/spo600-test-clone.tgz cd spo600/examples/test-clone make Detailed Test Results x86_64 Results On the **x86_64 **platform, my pass successfully identified both prune and no-prune cases: For prune test case: ;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=3954, cgraph_uid=24, symbol_order=23) ************************************************************ * ANALYZATION - Examining Function: scale_samples * ************************************************************ ************************************************************ * ANALYSIS FINISHED! ************************************************************ ;; Function scale_samples.popcnt (scale_samples.popcnt, funcdef_no=25, decl_uid=3985, cgraph_uid=30, symbol_order=28) ************************************************************ * ANALYZATION - Examining Function: scale_samples.popcnt * ************************************************************ PRUNE: scale_samples CLONE FOUND: scale_samples CURRENT: scale_samples.popcnt ************************************************************ * End of Diagnostic ************************************************************ For no-prune test case: ;; Function scale_samples.arch_x86_64_v3 (scale_samples.arch_x86_64_v3, funcdef_no=25, decl_uid=3985, cgraph_uid=30, symbol_order=28) ************************************************************ * ANALYZATION - Examining Function: scale_samples.arch_x86_64_v3 * ************************************************************ NOPRUNE: scale_samples CLONE FOUND: scale_samples CURRENT: scale_samples.arch_x86_64_v3 ************************************************************ * End of Diagnostic ************************************************************ aarch64 Challenges On aarch64 platform, I encountered this error: gcc -D 'CLONE_ATTRIBUTE=__attribute__((target_clones("default","rng") ))'\ -march=armv8-a -g -O3 -fno-lto -ftree-vectorize -fdump-tree-all -fdump-ipa-all -fdump-rtl-all \ clone-test-core.c vol_createsample.o -o clone-test-aarch64-prune clone-test-core.c:28:6: error: pragma or attribute 'target("rng")' is not valid 28 | void scale_samples(int16_t *in, int16_t *out, int cnt, int volume) { | ^~~~~~~~~~~~~ make: *** [Makefile:35: clone-test-aarch64-prune] Error 1 This error happens because rng is not valid target attribute for aarch64 architecture. Unlike x86_64 which has attributes like "popcnt" and "arch=x86-64-v3", aarch64 has different set of supported CPU features. This shows important cross-platform thing: Function Multi-Versioning attributes are architecture-specific, and code that works on one architecture may need changes to work on another. Capabilities and Limitations What My Implementation Can Do Base Function Identification: Correctly identifies base name of cloned functions by stripping variant suffixes. Resolver Function Detection: Recognizes and skips resolver functions, which handle runtime selection between clones. Basic Structure Comparison: Compares number of basic blocks and GIMPLE statements to determine if functions potentially equivalent. Output Requirements: Produces required PRUNE or NOPRUNE messages in correct format. State Management: Manages state between function calls to compare different clones using std::string for storage rather than C-style character arrays, making code more robust. Technical Limitations Superficial Comparison: Comparison based only on block and statement counts, not on actual code semantics. Two functions with same number of statements but different logic would incorrectly considered identical. No SSA Variable Normalization: Implementation doesn't normalize variable names before comparison, which more robust solution would do. Architecture Dependency: As shown by aarch64 error, current implementation doesn't fully handle cross-architecture differences in FMV attributes. Single Clone Assumption: Code assumes only one cloned function with two variants, as per project specs, but this isn't scalable to real-world codebases. Limited Structural Analysis: My pass don't analyze control flow structure within functions, which woul

Table of Contents
- Reproducing My Setup
-
Detailed Test Results
- x86_64 Results
- aarch64 Challenges
-
Capabilities and Limitations
- What My Implementation Can Do
- Technical Limitations
- Knowledge Gaps and Personal Reflections
- Technical Improvements for Stage III
- Conclusion
Reproducing My Setup
To replicate my work, follow these steps:
First, create or modify pass file located at
gcc/gcc/tree-amullagaliev.cc
with implementation shown in Part 1.Navigate to your GCC build directory:
cd ~/gcc-build-001/
- Rebuild GCC with your modified pass:
time make -j$(nproc)
- Test your pass using provided test cases:
cd /path/to/test/directory
tar -xzf /public/spo600-test-clone.tgz
cd spo600/examples/test-clone
make
Detailed Test Results
x86_64 Results
On the **x86_64 **platform, my pass successfully identified both prune and no-prune cases:
For prune test case:
;; Function scale_samples (scale_samples.default, funcdef_no=23, decl_uid=3954, cgraph_uid=24, symbol_order=23)
************************************************************
* ANALYZATION - Examining Function: scale_samples *
************************************************************
************************************************************
* ANALYSIS FINISHED!
************************************************************
;; Function scale_samples.popcnt (scale_samples.popcnt, funcdef_no=25, decl_uid=3985, cgraph_uid=30, symbol_order=28)
************************************************************
* ANALYZATION - Examining Function: scale_samples.popcnt *
************************************************************
PRUNE: scale_samples
CLONE FOUND: scale_samples
CURRENT: scale_samples.popcnt
************************************************************
* End of Diagnostic
************************************************************
For no-prune test case:
;; Function scale_samples.arch_x86_64_v3 (scale_samples.arch_x86_64_v3, funcdef_no=25, decl_uid=3985, cgraph_uid=30, symbol_order=28)
************************************************************
* ANALYZATION - Examining Function: scale_samples.arch_x86_64_v3 *
************************************************************
NOPRUNE: scale_samples
CLONE FOUND: scale_samples
CURRENT: scale_samples.arch_x86_64_v3
************************************************************
* End of Diagnostic
************************************************************
aarch64 Challenges
On aarch64 platform, I encountered this error:
gcc -D 'CLONE_ATTRIBUTE=__attribute__((target_clones("default","rng") ))'\
-march=armv8-a -g -O3 -fno-lto -ftree-vectorize -fdump-tree-all -fdump-ipa-all -fdump-rtl-all \
clone-test-core.c vol_createsample.o -o clone-test-aarch64-prune
clone-test-core.c:28:6: error: pragma or attribute 'target("rng")' is not valid
28 | void scale_samples(int16_t *in, int16_t *out, int cnt, int volume) {
| ^~~~~~~~~~~~~
make: *** [Makefile:35: clone-test-aarch64-prune] Error 1
This error happens because rng
is not valid target attribute for aarch64 architecture. Unlike x86_64 which has attributes like "popcnt" and "arch=x86-64-v3", aarch64 has different set of supported CPU features.
This shows important cross-platform thing: Function Multi-Versioning attributes are architecture-specific, and code that works on one architecture may need changes to work on another.
Capabilities and Limitations
What My Implementation Can Do
Base Function Identification: Correctly identifies base name of cloned functions by stripping variant suffixes.
Resolver Function Detection: Recognizes and skips resolver functions, which handle runtime selection between clones.
Basic Structure Comparison: Compares number of basic blocks and GIMPLE statements to determine if functions potentially equivalent.
Output Requirements: Produces required
PRUNE
orNOPRUNE
messages in correct format.State Management: Manages state between function calls to compare different clones using std::string for storage rather than C-style character arrays, making code more robust.
Technical Limitations
Superficial Comparison: Comparison based only on block and statement counts, not on actual code semantics. Two functions with same number of statements but different logic would incorrectly considered identical.
No SSA Variable Normalization: Implementation doesn't normalize variable names before comparison, which more robust solution would do.
Architecture Dependency: As shown by aarch64 error, current implementation doesn't fully handle cross-architecture differences in FMV attributes.
Single Clone Assumption: Code assumes only one cloned function with two variants, as per project specs, but this isn't scalable to real-world codebases.
Limited Structural Analysis: My pass don't analyze control flow structure within functions, which would be necessary for truly robust clone detection algorithm.
Knowledge Gaps and Personal Reflections
This project revealed several knowledge gaps I need address:
Architecture-Specific Compiler Features: I need deeper understanding of how architecture-specific features implemented across different platforms. My aarch64 issues highlighted this gap.
GCC GIMPLE Internals: While I understand basics of GIMPLE representation, I need more thorough understanding of how to analyze and compare GIMPLE statements.
Cross-Architecture Testing: I need better strategies for developing and testing features that must work across different architectures.
I found most challenging aspect to be understanding how to effectively compare function structures beyond simple metrics. Specifically, I struggled with how to normalize SSA variables and other identifiers to determine when two functions was semantically equivalent despite superficial differences.
The most interesting part was seeing how GCC implements Function Multi-Versioning - resolver functions and naming conventions used for variants gave me insight into how runtime feature detection works. Professor Chris Tyler lectures were incredibly helpful in understanding these concepts and inner workings of GCC. I couldnt figure this out without his explanations!
Technical Improvements for Stage III
For Stage III, I plan address these technical issues:
Deep Structural Comparison: Implement GIMPLE statement-by-statement comparison that normalizes SSA variables, labels, and basic block numbers.
Architecture-Agnostic Approach: Modify implementation to handle architecture-specific differences in more robust way.
Hash-Based Signature: Generate normalized hash or signature for each function structure to make comparisons more efficient and accurate.
Better State Management: Improve current state management to handle more complex scenarios with multiple clones.
Control Flow Analysis: Add analysis of control flow structure, which would capture logical equivalence of functions beyond just statement counts.
Conclusion
Stage 2 has been challenging yet enlightening. Working directly with GCC has given me a deeper understanding of compiler optimization techniques, particularly Function Multi-Versioning.
Cross-architecture challenges I encountered were unexpected but provided valuable learning experiences about how compiler features can differ between platforms.
While my current implementation fulfills the basic requirements of the project, the limitations I identified provide clear direction for improvements in Stage III. I'm particularly interested in developing a more robust comparison algorithm that can accurately determine when two functions are semantically equivalent despite sketchy differences.
Professor Chris Tyler guidance and lectures been instrumental in this journey, providing foundation of knowledge needed to tackle these complex compiler topics. Without his clear explanations, navigating GCC internal structures would be much more difficult.
The journey continues in Stage III, where I'll improve these techniques and address the limitations identified here.
Note: All code for this project is available in my GitHub repository. Feel free to clone it and follow steps above to reproduce my results.