CppGen - Ugur Buyukdurak: Difference between revisions

From CS486wiki

Revision as of 06:07, 3 November 2015

Project Background

Project Definition: CppGen aims to shorten C++ development time through automatic code generation. It targets people who don't have access to the full-featured C++ IDEs on the market. C++ has strict syntax rules that nobody gets right every time, which means compiler errors and frustrated programmers. Even a programmer who types everything correctly still wastes time on extra typing because of the C++ split between header and implementation files. More typing means more work, and the longer the code gets, the more mistakes can be made along the way. Automatic code generation removes the chance of a mismatch when writing the implementation (.cpp) of a declaration file (.h), unless the header itself is wrong.
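A minimal sketch of the idea, assuming the declarations have already been extracted from the header: it turns a list of method descriptions into an empty .cpp skeleton. The `Method` class and the `render_stub` and `render_cpp` helpers are hypothetical names made up for this illustration, not CppGen's actual code, and plain string formatting stands in for a templating engine.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Method:
    """One member function declaration (hypothetical model for illustration)."""
    return_type: str
    name: str
    params: List[Tuple[str, str]] = field(default_factory=list)  # (type, name) pairs
    const: bool = False


def render_stub(class_name: str, method: Method) -> str:
    """Return an empty out-of-class definition for one declared method."""
    args = ", ".join(f"{ptype} {pname}" for ptype, pname in method.params)
    qualifier = " const" if method.const else ""
    return (
        f"{method.return_type} {class_name}::{method.name}({args}){qualifier}\n"
        "{\n"
        "    // TODO: implement\n"
        "}\n"
    )


def render_cpp(header_name: str, class_name: str, methods: List[Method]) -> str:
    """Assemble a .cpp skeleton that includes the header and defines every method."""
    parts = [f'#include "{header_name}"\n']
    parts.extend(render_stub(class_name, m) for m in methods)
    return "\n".join(parts)


if __name__ == "__main__":
    stack_methods = [
        Method("void", "push", [("int", "value")]),
        Method("int", "size", const=True),
    ]
    print(render_cpp("stack.h", "Stack", stack_methods))
```

Because every signature is copied from the declaration rather than retyped, the definition cannot drift from the header, which is the class of error described above.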

Starting Point: This idea originated in the CS 240 Data Structures class, where I had to code in a pure Linux environment without a full-featured IDE. There was no syntax highlighting, type checking or anything similar. There was just a raw text editor and a command-line compiler. The compiler would always complain that I had typed something wrong and needed to fix it. The problems were mainly caused by variables or methods of a class that didn't match their declaration in another file. So I thought that generating some of the source code from the base code would make my job much easier and let me focus on what I was doing.

The Real Work: Although I knew that generating .cpp files (C++ implementation files) from header files would be great, implementing the project was a serious challenge for me. There were many gaps in my understanding that had to be filled. Was I going to implement my own compiler, or was I going to use existing tools to achieve what I wanted? If I implemented my own compiler, how would I do that? If I used existing tools, which ones?

The real work was never the coding, since coding is mostly a matter of knowing a language's syntax rules and general flow, along with some fundamental understanding. The real challenge was figuring out how the intended goal could be accomplished, which meant long hours spent searching for the right tools.

Another big challenge, after figuring out how something could be done, was understanding how these "right tools" worked. After several months spent trying to understand a program that dumps an XML description of a C++ file and a Python library that reads that XML dump, I was finally ready to do something useful and real. The idea wasn't just "in mind" anymore. It had become something concrete.

After 3-4 months of research and learning, I was finally able to code the project. Coding it took just one month. The actual coding time was much shorter, but I had other things to do in the meantime.

Developing something is one challenge; deploying it is another. Knowing how to write code does not by itself help you use an operating system efficiently and correctly. Computers are all about operating systems, so it is crucial to know how to use them in a way that fits your needs. I had completed courses and received two certifications in Linux System Administration before I ever needed to deploy my project on the web. The skills I gained from system administration really helped me manage my project efficiently. I was able to put it online in just one day.

The point I am trying to make is that coding or deploying the project was not the hard part. The hard part was the time spent researching, learning and gathering all these different pieces to bring one meaningful thing into existence.

What I learned: I had the chance to develop my system administration skills a lot, since I had to deal with ssh, the Linux file system, web servers and the Linux operating system itself. Besides those, I also needed a fundamental understanding of DNS, TCP/IP, Python web frameworks, Python templating engines for generating text automatically, several Python libraries, and so on. But most importantly, I realized that the hardest part is never coding but analysis.

One important thing that I didn't mention before is the design of a program. It is important to design a program so that it can be extended and modified easily. Time, requirements and demands all change, and so should the program, without needing to be rewritten from the beginning.

Reliability: CppGen is reliable in that it doesn't implement its own compiler but instead relies on established, widely used tools. There is little chance that the building blocks underneath CppGen will cause fundamental problems.

Main Goals of the Project:

-To create a world-renowned website used by programmers, universities and students around the world.

-To have no commercial purpose. Everything should be open source.

-To join hackathons, and be recognized.

-To make a website or a tool that has the capacity to be recognized by authorities within the tech world.

-To receive some donation or funding if possible.

-To replace some of the functionality provided by commercial IDEs.

-To be up to date.

-To create a functional and fast website without complexity.

-To be reliable.

-To be able to implement my own compiler (if at all) at some point rather than using ready-made tools.

-To feel that I didn't go to university for nothing.

-To know that it belongs completely to me, with nobody else involved in the project.

Before and After

How will this project fit into Senior Project?: Senior Project is a way for students to prove that their undergraduate education means more than a diploma. It requires students to build things that combine the different skills gained during their undergraduate program with their own creativity. CppGen already does that. To build it, I needed to be more than just a student, a programmer or a system administrator.

Where am I?: Since many parts of this project were done before senior year even started, the project is not going to start from scratch for Senior Project. My website already has its core features: multi-threaded execution, file upload, generation of a couple of files in a particular format, and more. It is in working condition but not really usable or practical yet. For details, see the FAQ page at cppgen.com/faq.

Where will I be heading to?: I have already implemented many of the major goals I had in mind. Along the way I have learned a lot about what to expect, project management, possible failures, possible problems and the differences between theory and practice. But that doesn't mean I don't have other plans. There are many more lessons to learn from not giving up and carrying on. Each step I take will become another lesson.

My main intention is to make my existing website more usable, beautiful and well known. To do that, I need a plan.

Things may/may not change: There are some building blocks that CppGen relies on that may change. One of them is the gccxml program, which is used to generate XML dumps of a C++ file. Although it provides the project's base functionality, newer technologies should be sought out and checked for compatibility with the existing project. Possible alternatives are:

GCC-XML --> CppGen is already built with gccxml. Source files uploaded to the server are passed to gccxml, and the temporary XML representations are then read by the Python library pygccxml. The problem with gccxml is that it is old technology (it supports the C++98 standard) and doesn't follow newer standards. It has been succeeded by CastXML.

CastXML --> gccxml has been replaced by CastXML, which also supports the C++11 standard. The problem here is that gccxml is fully compatible with the Python library pygccxml, and I am not sure whether CastXML is too. I contacted its implementors some time ago. I will do so again to get more information.

Clang++ --> It is said to have once had an XML dumping feature, but not anymore. Clang still builds an AST (Abstract Syntax Tree) internally; it just doesn't produce an XML representation of it. If there is a way to parse the Clang++ AST, then this is another option.

Doxygen --> This is a recent option I came across. It has an XML dumping feature with C++11 support. However, it is said to have a few bugs with C++11 syntax. The other problem is that I don't know whether any existing tool can parse the program's XML output. If not, I will have to implement my own parser if I choose Doxygen.
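To make the gccxml step of the list above more concrete, here is a rough sketch of reading an XML dump and recovering method signatures from it. The XML fragment is hand-written and heavily simplified; its element and attribute names are modelled on gccxml-style output but should be treated as an assumption, not real tool output. CppGen itself relies on pygccxml for this step; the manual walk with the standard library below only illustrates what such a library has to do.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified fragment in the style of a gccxml dump (assumption:
# a real dump carries many more elements and attributes than shown here).
SAMPLE_XML = """\
<GCC_XML>
  <FundamentalType id="_1" name="int"/>
  <FundamentalType id="_2" name="void"/>
  <Class id="_3" name="Stack" members="_4 _5"/>
  <Method id="_4" name="push" returns="_2" context="_3">
    <Argument name="value" type="_1"/>
  </Method>
  <Method id="_5" name="size" returns="_1" context="_3" const="1"/>
</GCC_XML>
"""


def list_methods(xml_text):
    """Return 'ret Class::name(args)' strings for every method of every class."""
    root = ET.fromstring(xml_text)
    # Index top-level nodes by id so that type and member references can be resolved.
    by_id = {node.get("id"): node for node in root if node.get("id")}
    signatures = []
    for cls in root.findall("Class"):
        for member_id in cls.get("members", "").split():
            member = by_id.get(member_id)
            if member is None or member.tag != "Method":
                continue
            ret = by_id[member.get("returns")].get("name")
            args = ", ".join(
                f'{by_id[arg.get("type")].get("name")} {arg.get("name")}'
                for arg in member.findall("Argument")
            )
            const = " const" if member.get("const") == "1" else ""
            signatures.append(
                f'{ret} {cls.get("name")}::{member.get("name")}({args}){const}'
            )
    return signatures


if __name__ == "__main__":
    for signature in list_methods(SAMPLE_XML):
        print(signature)
```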

UPDATE: I have tried to contact the current developer of pygccxml. He responded to me a long time ago, but I am not sure he will do the same this time. I have also subscribed to the CastXML mailing list and will be posting some questions in the near future.

UPDATE 9/29/2015: I talked with one of the professors to get an idea of the possible options. One option he offered was to parse the AST (Abstract Syntax Tree) dumped by the clang++ compiler. He said it could be parsed within the program (I didn't quite get what he meant by this) and could serve the same function that gccxml serves now. This means that if I wanted to use the clang++ compiler, I would also have to implement my own equivalent of the pygccxml library, which would of course be another project in itself.

UPDATE 9/29/2015: Some comments on the internet suggest Doxygen. For now, I have no idea how Doxygen could be used or how it would fit into this project.

Clang AST Introduction Video : https://www.youtube.com/watch?t=67&v=VqCkCDFLSsc

A video about castXML and integration with pygccxml : https://www.youtube.com/watch?v=O2lBgtaDdyk

An advantage of CastXML over gccxml : http://permalink.gmane.org/gmane.comp.compilers.gccxml/730

UPDATE 09/30/2015: A response from the CastXML mailing list has just arrived. According to the response, although CastXML is able to parse C++11, its XML representation will cover only C++98. This means that header files submitted to my website that use C++11 would be accepted and parsed by the base program, but since the output generated by CastXML is structured around C++98 rather than C++11, code generated automatically from that representation would include no C++11 characteristics. So the question is: is it worth switching the base technology? Here is a quote from the response:

On 09/30/2015 10:36 AM, Uğur Büyükdurak wrote:

> I would like to know whether it is possible or not to build and use castxml
> on windows machines.
Yes.
> I would also like to know whether it is compatible with pygccxml or not.
See here for pygccxml status with regard to castxml:
https://github.com/gccxml/pygccxml/issues/19
> It is also mentioned that a C++ compiler supporting c++11 syntax and
> clang/llvm compiler(SDK install tree built) are needed which I don't
> understand why installing process needs two different compilers at
> the same time. So basically why does castxml need clang and why does
> it need another C++ compiler.
CastXML needs the LLVM/Clang SDK in order to build.  While gccxml came
with its own patched copy of GCC, CastXML builds directly against an
externally-built LLVM/Clang.  Since Clang is implemented in C++11,
CastXML is also implemented in C++11.  Therefore one needs a C++11
capable compiler to build both of them.  The Clang compiler is not
executed or used to compile anything but it must be built to get its
SDK.
Once CastXML is built and installed then it works independently
from the original LLVM/Clang SDK installation.
> And one thing I forgot to ask, does castxml support C++11 specification?
CastXML does support parsing C++11 code but only the C++98 subset
of the interfaces will be included in the output (no APIs with
rvalue references for example).  One can pass -std=c++11 to castxml.
-Brad
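As an illustration of the last point in the quoted reply, the sketch below assembles a castxml command line that passes -std=c++11. The `-std=c++11` flag comes from the reply itself; `--castxml-gccxml` (gccxml-style output) and `-o` (output file) reflect my understanding of the CastXML command line and should be checked against `castxml --help`. The `build_castxml_command` helper and the file names are hypothetical.

```python
def build_castxml_command(header_path, xml_path, std="c++11", executable="castxml"):
    """Build the argument list for one castxml run (the command is not executed here)."""
    return [
        executable,
        "--castxml-gccxml",  # assumption: request gccxml-compatible XML output
        f"-std={std}",       # language standard, as mentioned in the quoted reply
        "-o", xml_path,      # assumption: where the XML dump is written
        header_path,
    ]


if __name__ == "__main__":
    command = build_castxml_command("stack.h", "stack.xml")
    print(" ".join(command))
```

The list can be handed to `subprocess.run(command)` on a machine where castxml is installed. Keeping the command as a list rather than a single shell string avoids quoting problems with uploaded file names.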

Project Structure

1.1 Background

I have already described the project's background and primary purpose in the previous section.

1.2 Project Objectives

There are a couple of objectives to get done. CastXML will be integrated into the existing project. I have already looked at the CastXML tests with pygccxml, and for now they seem compatible to the extent I need. There are problems when virtual functions and inheritance are involved; these need to be fixed. Another change is to restructure the source code, which is not very flexible at present. It needs to be redesigned around design patterns to make it more modular. Output that is currently piped to standard output and standard error should be directed to files in the filesystem. These files can later be read to show syntax errors and warnings from users' header files in HTML, so that users can see them. Setup scripts are needed to move the project between machines. It is hard to manage every source file and third-party library by hand, so there is a need for a source code management system that fits the project's needs. The user interface needs to change. A drag-and-drop interface seems to be the best solution because it is easy for users. There will also be a command-line version of the program. The command-line application may need its own setup script. The source code will also be uploaded to GitHub so that the public can see it. To summarize:

1-)CastXML will be built from source on a Linux machine (probably Fedora) and integrated into the project.

2-)Fix problems with inheritance and virtual functions.

3-)Restructure the source code.

4-)Some of the information on stdout and stderr should be shown to the user by means of HTML.

5-)Various setup scripts depending on needs.

6-)Drag-and-drop user interface.

7-)Command-line version of the program.

8-)Source code will be uploaded to GitHub for public trust and opinion, along with help.
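Objective 4 can be sketched as follows: run the external tool with its stdout and stderr redirected to files, then escape the saved text before embedding it in a page. This is only one possible approach; the `run_and_log` and `log_to_html` helpers and the log file names are hypothetical, and a child Python process with a made-up diagnostic stands in for the real parser run.

```python
import html
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(command, log_dir):
    """Run a command, writing its stdout and stderr to files under log_dir."""
    log_dir = Path(log_dir)
    out_path = log_dir / "stdout.log"
    err_path = log_dir / "stderr.log"
    with open(out_path, "w") as out, open(err_path, "w") as err:
        result = subprocess.run(command, stdout=out, stderr=err)
    return result.returncode, out_path, err_path


def log_to_html(log_path):
    """Wrap a log file in <pre>, escaping characters that are special in HTML."""
    text = Path(log_path).read_text()
    return "<pre>" + html.escape(text) + "</pre>"


if __name__ == "__main__":
    # Stand-in for the real parser: a child process that fails and writes
    # a made-up diagnostic to stderr.
    fake_parser = [
        sys.executable, "-c",
        "import sys; sys.stderr.write('stack.h:3: error: expected <type>'); sys.exit(1)",
    ]
    with tempfile.TemporaryDirectory() as log_dir:
        code, _, err_path = run_and_log(fake_parser, log_dir)
        print(code)
        print(log_to_html(err_path))
```

Escaping matters here because compiler diagnostics for C++ routinely contain `<` and `>` (templates, includes), which would otherwise be interpreted as markup.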

UPDATED:

--People I have talked to said it would be really useful to include a separate program for generating makefiles.

9-)Separate section on the website that helps create makefiles.

10-)Separate command-line utility that helps people create makefiles.

11-)Adding compilation options from which people can choose.

12-)Performance tests to determine its worst-case running time, for example O(n) or O(log n).

13-)Source code compilation options for those who don't have compilers installed on their computers.
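The makefile items (9 to 11) could start from something as small as the sketch below, which renders a basic makefile from a target name, a list of source files, a compiler and a set of flags. The `render_makefile` function and its defaults are assumptions chosen for illustration; a real utility would also need to track header dependencies.

```python
from pathlib import PurePosixPath


def render_makefile(target, sources, compiler="g++", flags=("-std=c++11", "-Wall")):
    """Return the text of a simple makefile for the given C++ sources."""
    objects = [str(PurePosixPath(src).with_suffix(".o")) for src in sources]
    lines = [
        f"CXX = {compiler}",
        f"CXXFLAGS = {' '.join(flags)}",
        f"OBJS = {' '.join(objects)}",
        "",
        f"{target}: $(OBJS)",
        f"\t$(CXX) $(CXXFLAGS) -o {target} $(OBJS)",  # recipe lines must start with a tab
        "",
        "%.o: %.cpp",
        "\t$(CXX) $(CXXFLAGS) -c $< -o $@",
        "",
        ".PHONY: clean",
        "clean:",
        f"\trm -f {target} $(OBJS)",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_makefile("app", ["main.cpp", "stack.cpp"], compiler="clang++"))
```

Exposing the compiler and flags as parameters is one way to offer the selectable compilation options mentioned in item 11.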

1.3 Project Constraints

1-)Project shouldn't go out of scope.

2-)It is a one-term project. The project for the second term will be a command-line chat application that does not keep any record of its users' actions.

3-)There is not much base technology available apart from writing my own compiler, which is of course impossible for me to do (for now at least).

4-)CastXML will be used as the base technology; the other options have been set aside.

5-)The project is already written in Python; Python will be used extensively, with the Jinja2 templating engine and the Flask web framework.

6-)Compilers to be used will be g++ and clang++.

1.4 Project Scope

An already installed Linux system is needed. A full-featured Python IDE is crucial for managing source code at large scale; PyCharm has been used and will continue to be used for this purpose. Fedora is the primary choice of Linux distribution because it is one of the most popular distributions, with a huge user community. Communication with the CastXML community is needed, so it is important to subscribe to their mailing list. A remote cluster is needed for deploying the final application and putting it online. digitalocean.com is the primary choice because it allows a complete installation of the system from scratch, which makes it easy to customize for different needs. There will probably be a need for advice on various aspects of the Linux system. The internet is considered the primary source of information.

CppGen is a one-term project only. It will not cover the second term. A command-line chat application will be the project for the second term.

UPDATE:

For compiling purposes, CMake is a crucial part of the project. There is no precompiled version of CastXML in the Linux repositories yet. For that reason, it is important to be able to compile CastXML by hand.

1.5 Project Assumptions
There is an assumption that CastXML and pygccxml are compatible. There is also another very important assumption: that CastXML is able to parse C++11. The whole project will be built upon these two assumptions. A failure of either one means possible failure for the project. Source code management and the user interface are separate technologies that are not related to parsing C++ syntax. No assumptions are made about them with respect to CastXML and pygccxml. They can be implemented separately.
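Before either assumption can be tested, both tools have to be present on the machine. The sketch below is a small prerequisite check using only the standard library; it reports whether an executable is on the PATH and whether a Python module can be imported. It says nothing about whether the two are actually compatible, which still has to be verified by parsing real headers. The `check_prerequisites` name and its defaults are hypothetical.

```python
import importlib.util
import shutil


def check_prerequisites(executables=("castxml",), modules=("pygccxml",)):
    """Map each required executable and Python module to True if it is available."""
    status = {}
    for exe in executables:
        status[exe] = shutil.which(exe) is not None
    for mod in modules:
        status[mod] = importlib.util.find_spec(mod) is not None
    return status


if __name__ == "__main__":
    for name, available in check_prerequisites().items():
        print(f"{name}: {'found' if available else 'missing'}")
```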