Tag: software engineering

Entries for tag "software engineering", ordered from most recent. Entry count: 32.

Warning! Some information on this page is older than 5 years now. I keep it for reference, but it probably doesn't reflect my current knowledge and beliefs.


# 6th tip to understand legacy code

Sat 20 Jan 2018, 15:41

I just watched a video published 2 days ago: 5 Tips to Understand Legacy Code by Jonathan Boccara. I like it a lot, yet I feel that something is missing. The author gives 5 tips on how you can start figuring out a large codebase that is new to you.

  1. "Find a stronghold" - a small portion of the code (even single line) that you understand perfectly and expand from there, look at the code around it.
  2. "Analyze stacks" - put a breakpoint to capture a representative moment in the execution of the program (with someone's help) and look at call stack to understand layers of the code.
  3. "Start from I/O" - analyze the code that processes data at the very beginning (input data, e.g. source file) or at the end (output data).
  4. "Decoupling" - learn by trying to do some refactoring of the code, especially to decouple some of its parts.
  5. "Padded-room" - find a code that doesn't depend on any other (has limited scope) and go from there.

This is all great advice and I agree with it. But computer programs have two aspects: dynamic (the way the code executes over time - algorithms, functions, control flow) and static (the way data are stored in memory - data structures). I have a feeling that these 5 points focus mostly on the dynamic aspect, so as an advocate of "data-oriented design", I would add another point:

6. "Core data structure": Find structures, classes, variables, enums, and other definitions that describe the most important data that the program is operating on. For example, if it's a game engine, see how objects of a 3D scene are defined (some abstract base CGameObject class), how they can relate to each other (forming a tree structure or so-called scene graph), what properties do they have (name, ID, position, size, orientation, color), what kinds of them are available (mesh, camera, light). Or if that's a compiler, look for definition of Abstract Syntax Tree (AST) node and enum with list of all opcodes. Draw UML diagram that will show what data types are defined, what member variables do they contain and how do they relate to each other (inheritance, composition). After visualizing and understanding that, it will be much easier to analyze the dynamic aspect - code of algorithms that operate on this data. Together, they form the essence of the program. All the rest are just helpers.

Comments | #software engineering

# The Virtual Reality of Code

Wed 11 Nov 2015, 18:59

In my opinion, coding is a virtual reality with its own set of rules, unlike those of the physical world. In the physical world, each thing has a specific location at any moment, things don't appear or disappear instantly, and we have the laws of conservation of energy and mass. Inside a computer, we have data in memory (which is actually linear - 1D) and processors, processes and threads executing instructions of a program over time.

I have been trying for years to imagine some nice way of editing, or at least visualizing, code that would be more convenient (and look more attractive) than the text representation we all use. I have been unsuccessful, because even if we render some nice-looking depiction of a function's code - its instructions, conditions and loops - drawing all the connections with the variables and functions that this code uses would clutter the image too much. It's as if this virtual world just doesn't fit into a 2D or 3D one.



Movie depictions of computer programs are often visually attractive, but far from practical. This one comes from the movie "Hackers".

Of course there are ways of explaining a computer program on a diagram, e.g. entity diagrams or UML notation. I think that electrical diagrams are something in between: electrical devices end up as physical things, but on a diagram, the shape, size and position of specific elements don't match how they will be arranged on a PCB. Only the logical representation and the electrical connections between elements matter.

Today it occurred to me that this virtual reality of code also has "dimensions", in its own sense. It becomes evident when learning programming.

1. First, one has to understand control flow - that the processor executes subsequent instructions of the program over time and that there can be jumps caused by conditions, loops, function calls and returns. This is called the dynamic aspect of the code. It can be depicted e.g. by a UML sequence diagram or an activity diagram.

I still remember a colleague at university who couldn't understand this. What's obvious to every programmer was a great mystery to him as a first-year computer science student. He thought that when there is a function above main(), the instructions of that function are executed first, and then the instructions of the main function. He just couldn't imagine the concept of calling one function from another and returning from it.
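
A minimal example of the concept he was missing - execution starts at main() no matter where in the file a function is defined:

#include <cstdio>

// Defined above main(), but NOT executed first - it runs only when called.
void Greet() {
    std::printf("inside Greet()\n");
}

int main() {
    std::printf("main() starts\n");
    Greet();                         // jump into Greet(), then return here
    std::printf("back in main()\n");
    return 0;
}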

2. Then, there is the so-called static aspect - the data structures and objects that are in memory at a given moment in time. This involves understanding that objects can be created in one place and destroyed in another place at a later time, that there may be collections of multiple objects (arrays, lists, trees, graphs and other data structures), and that objects may be connected to one another (the concept of pointers or references). Various relationships are possible, like one-to-one and one-to-many. In object-oriented methodologies, there is another layer of depth here, as there are classes and interfaces (with inheritance, composition etc.) and there are actual objects in memory that are instances of these classes. It can be depicted for example by an entity diagram or a UML class diagram.
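
As a small illustration of that static aspect (made-up structures), pointers can express a one-to-many relationship between objects living in memory:

#include <vector>

struct Employee;                      // forward declaration

struct Department {
    std::vector<Employee*> members;   // one department - many employees
};

struct Employee {
    Department *department = nullptr; // back-reference: many-to-one
};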

3. Finally, one has to learn about version control systems, which store the history of the code. This adds another dimension to all of the above, as over successive commits developers make changes to the code - its structure and the way it works, including the format of the data structures it uses. Branches and merges add even more complexity. GUI apps for version control systems offer some ways of visualizing this, whether it's showing a timeline with commits ("History") or showing who committed each line of a file and when ("Blame").

There is even more to it. Source code is organized into files and directories, which may be more or less related to the structure of the code they contain. Multithreading (and coprocessing, e.g. with GPU, SIMD, and all kinds of parallel programming) complicates imagining (and especially debugging!) the control flow of a program. Program binaries and data may be distributed across multiple programs and machines that have to communicate with each other.

It fascinates me how software is so multidimensional in its own way, while so flat at the same time (after all, computer memory is linear). I believe that becoming a programmer is all about learning how to imagine, navigate and create stuff in this virtual reality.

Comments | #philosophy #software engineering

# The Importance of Good Debugging Tools

Thu 28 May 2015, 05:29

Robert L. Glass wrote an interesting article, "Frequently Forgotten Fundamental Facts about Software Engineering". While I am aware that he is far more experienced than me and has sources to back up his opinions, I dare to disagree with the following paragraph:

T1. Most software tool and technique improvements account for about a 5- to 30-percent increase in productivity and quality. But at one time or another, most of these improvements have been claimed by someone to have "order of magnitude" (factor of 10) benefits. Hype is the plague on the house of software.

I think that having good tools is one of the most important (and sometimes underrated) factors in programmers' efficiency. From a comfortable chair, through a fast desktop PC, big monitors and a fast Internet connection, to good software, including an IDE (editor, compiler, debugger etc.) and auxiliary applications (like Total Commander for file management) - they can all make a big difference in how developers feel about their job and how fast the work goes.

Of course, "tools" is a broad term. If by choosing good tools we mean changing used programming language or libraries in the middle of the project, then sure it is usually a bad idea. Writing a script to automate some task that can be done manually in just few minutes or even an hour is usually not worth doing as well.

But what is always worth investing time and money in (whether buying tools, developing your own or learning to use them) are tools that help with debugging. As debugging is the hardest and most time-consuming part of programming, any improvement in that process can make a difference as big as that between minutes and weeks - often at the most critical time, close to the deadline.

So in my opinion, being able to interactively debug an executing program (or its trace) - to set a breakpoint and preview the current values of variables (as opposed to relying only on debug text prints, analyzing logs or some command-line tools) - is the absolute minimum for calling a programming environment reasonable, mature and eligible for writing any serious software. If you are the one who writes a program and you have to treat that program as a black box while it is running, without any possibility to peek inside, then something is wrong. AFAIK that is the case with some technologies that deploy a program to a server or an embedded device.

How is it in graphics programming? John Carmack once tweeted:

gl_FragColor.x = 1.0; is the printf of graphics debugging. Stone knives and bearskins.

While I share his frustration with such primitive methods of debugging, the reality of today's GPU programming is not so bad that rendering red pixels is the only way to see any debug information. NVIDIA Nsight offers debugging on the GPU, and so do the Intel Graphics Performance Analyzers (GPA). Finally, there is a vendor-agnostic debugger for DirectX developed by Microsoft. Originating from the Xbox (which is where its name comes from - Performance Investigator for Xbox), PIX used to be available for free as a standalone application. Recently, Microsoft integrated it into Visual Studio as Graphics Diagnostics. It was available only in commercial versions of Visual Studio and not in the free Express edition (very bad news for all amateur DirectX game developers), but finally they shipped it for free with the new, fully-functional Visual Studio Community edition.

With these tools, there is no excuse for whining "nothing renders and I don't know why" - just go debug it! :) The future also looks bright: DirectX 12 will offer a debug layer, Vulkan also claims to support debugging tools and layers, and the LunarG company has even started working on such a tool - GLAVE.

Comments | #debugger #software engineering

# Too Low Level and Too High Level Abstraction

Fri 15 Jul 2011, 22:23

Programmers fascinated by object-oriented methodology love the idea of abstraction and generalization, but others - especially those interested in data-oriented design - consider "premature generalization" a big anti-pattern. I prefer the latter approach, so as I have recently been learning a little Java, including Servlets and JSP, I feel terrified by how its API looks. Let me explain...

Today an idea came to my mind: an API can be too abstract (too general, too universal) at a low level or at a high level. The most abstract low-level API looks like this:

int write(const void *buf, int bytes);
int read(void *buf, int bytes);

It's so general and universal that it would fit anywhere, but on the other hand it doesn't provide anything specific. It's just a very narrow pipe that we have to pass all our data through. Of course this method is sometimes useful - for example, that's the way we transfer data over a network. But we do it using some specific protocol, so it's reasonable to define a higher-level API with objects that encapsulate concepts specific to that protocol, just like cURL implements HTTP, FTP and other network protocols on top of network sockets.
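
As a sketch of such a higher-level, protocol-specific layer (hypothetical names, declarations only, wrapping the narrow pipe above):

#include <string>

// A protocol-specific API: callers think in terms of requests and
// headers instead of raw bytes pushed through write().
class HttpConnection {
public:
    explicit HttpConnection(int socketFd) : m_socketFd(socketFd) {}

    void SetHeader(const std::string &name, const std::string &value);
    std::string Get(const std::string &path);  // send request, return body

private:
    int m_socketFd;  // the narrow byte pipe still sits underneath
};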

Let's compare some details from two APIs. A socket can have additional options that can be set. There are different options for different kinds of sockets and they have different types - some of them are bools, others are ints. But there is only one function to set such options - setsockopt:

int setsockopt(SOCKET s, int level, int optname, const char *optval, int optlen);

Objects in the POSIX Threads API can also have additional attributes, but there are specific functions to set them, like in the example below. Which way do you prefer?

int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);
int pthread_attr_setstacksize(pthread_attr_t *attr, size_t stacksize);

But abstraction can also go too far the other way. That's what I see in the Java API. I mean, those who code a lot in this language can certainly "feel" the sense in this approach, but to me it's almost ridiculous. Not only does every part of the Java API have dozens of small classes with not much more than simple getName/setName methods, and an interface for everything as if people were afraid of using real classes, but lots of stuff is referred to by strings. I'd say Java is so high-level that it's not just a strongly-typed language, it's a stringly-typed language. Lots of classes implementing some interface can be registered in some global registry under a name, so that an instance can be constructed without ever referring to the real class with new SomeSpecificClass().

Probably the most grotesque example I saw today is the java.naming package. Including its subpackages, it contains about a hundred classes and interfaces. But there is more to it than that. It's a whole great body of knowledge called JNDI (Java Naming and Directory Interface), with long tutorials and lots of concepts to understand, like Context, InitialContext, Directory, Service Provider, Lookup, Names, Composite Names, Compound Names, Naming System, Federations and so on... All this just to provide an abstract interface to a tree of arbitrary objects referred to by name, so that disk files and directories can be accessed the same way as data received via LDAP or global variables defined in an XML file. Does anyone really need that? The javax.naming.Context interface is not far from what would be the ultimate high-level abstraction:

interface SuperAbstractEverything {
  public Object getObject(String namePathIdOrWhatever);
}

Comments | #java #software engineering

# My Talk in Kraków - Slides

Wed 30 Mar 2011, 22:47

I have just returned from Kraków, where I gave a talk at the AGH University of Science and Technology (Akademia Górniczo-Hutnicza), for the SKN Shader scientific group. The title of my presentation was "Pułapki programowania obiektowego" ("Pitfalls of Object-Oriented Programming"). Here are the slides: [PDF], [PPTX]. The presentation is new, although you could already have seen most of the slides at the IGK conference or at the Polygon scientific group.

By the way, I want to thank Koshmaar, TeMPOraL and all the members of SKN Shader for the wonderful time in Kraków. You can see several photos that I took there in my gallery on Picasaweb, as well as a panoramic photo on Panogio.

Comments | #events #software engineering #teaching

# March 29th, Kraków - my Next Presentation

Mon 28 Mar 2011, 23:23

Tomorrow I am going to Kraków to give a talk at the AGH University of Science and Technology (Akademia Górniczo-Hutnicza). I've been invited by my friends from the SKN Shader scientific group (greetings, Koshmaar and TeMPOraL :)

The title of my presentation will be "Pułapki programowania obiektowego" ("Pitfalls of Object-Oriented Programming"). The event will take place at 18:30 in building B1, room H24. More information is available in this entry on the SKN Shader website, as well as in the Facebook event.

Comments | #events #teaching #c++ #software engineering

# Static C++ Code Analysis with PVS-Studio

Sat 12 Mar 2011, 19:42

By courtesy of its authors, I have a chance to evaluate PVS-Studio - a static code analyzer for C, C++ and C++0x. This commercial application is installed as a plugin in Visual Studio 2005/2008/2010. Fortunately I have Visual Studio 2008 Professional at home, so I could try it with the code of my personal projects. PVS-Studio differs from other tools of this kind, like the free Cppcheck, by finding three types of errors or warnings: general, OpenMP-related, and 64-bit portability issues.

After opening my solution in Visual Studio, I choose a command from the special menu to analyze all the code.

A progress bar appears while PVS-Studio does its computations, utilizing almost 100% of all 4 CPU cores. Finally, a dockable panel appears with a list of the issues found.

The general category warns about exact float comparison with == and similar issues. It managed to find a few places where I forgot the "&" character while passing a vector as a const reference parameter, rightly telling me that this causes "decreased performance". But its greatest find in my CommonLib library code was one unbelievable bug.
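
To illustrate (with made-up code, not the actual findings), these are the two kinds of issues mentioned above:

#include <vector>

// Exact float comparison with == - flagged as suspicious, because
// accumulated rounding error makes such a condition almost never true.
bool IsFinished(float progress) {
    return progress == 1.0f;  // safer: fabsf(progress - 1.0f) < 1e-6f
}

// Forgotten "&" - the vector is copied on every call, which is the
// "decreased performance" the analyzer warns about.
float Sum(const std::vector<float> v);  // intended: const std::vector<float> &v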

Some messages look funny. Should I code some general, abstract, portable, object-oriented, Alexandrescu-style template-based solution here just to avoid copying some code into several similar instructions? :)

I didn't check how the OpenMP validation works, because I don't currently use this extension. As for 64-bit compatibility issues, I have lots of them - simply because my code is not prepared to be compiled as 64-bit. PVS-Studio seems to do a good job pointing to places where fixed-length 32-bit integers are mixed with pointers, array indexing etc.

Overall, PVS-Studio looks like a good tool for C++ programmers who care about the quality of their code. Finding issues related to OpenMP and 64-bit compatibility can be of great value, if you need such features.

Too bad that PVS-Studio, unlike Cppcheck, is a Visual Studio plugin and not a standalone application, so it requires a commercial MSVS version and does not work with the Express edition. But this is understandable - if you need OpenMP or 64-bit, you probably already use Visual Studio Professional or higher.

PVS-Studio analyzes C, C++ and C++0x. It doesn't work with the C++/CLI language, but that's not a big flaw either. I use C++/CLI at work, but I can see it's quite an unpopular, niche language. Its compilation or analysis would also be very difficult, because it mixes all the features of both native C++ and .NET. Even Microsoft didn't find the resources to implement IntelliSense for C++/CLI in Visual Studio 2010.

Comments | #tools #c++ #software engineering #visual studio #pvs-studio

# Data-Oriented Design - Links and Thoughts

Thu 06 Jan 2011, 11:20

In April 2008 I wrote an essay, "Fanatyzm obiektowy" (in Polish; the title means "Object-Oriented Fanaticism"). I've always believed there is something wrong with object-oriented programming - that it simply doesn't meet its own objectives, and that following it blindly as an ideology, not merely as a programming language mechanism, has many pitfalls. Now I'm glad that the concept of "Data-Oriented Design" (DOD) has recently emerged and gained popularity among game developers. Here is my attempt to aggregate all the important information on this subject that can be found on the Internet:

Slides:

Blog entries:

Other:

If you know any other good readings on this subject, please leave a comment. I'll update my list.

As far as I can see, focusing more on data instead of objects gives a number of benefits for the code:
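
One of the most commonly cited benefits is better cache utilization when hot data is laid out contiguously. A minimal sketch (names made up) of the classic array-of-structures versus structure-of-arrays contrast:

#include <string>
#include <vector>

// Array-of-structures (typical OOP layout): a pass that updates positions
// drags all the other, unused fields of each object through the cache.
struct GameObjectAoS {
    float position[3];
    float velocity[3];
    char  name[32];  // cold data, loaded anyway
};

// Structure-of-arrays (data-oriented layout): each pass touches only the
// arrays it actually needs, so cache lines carry mostly useful data.
struct GameObjectsSoA {
    std::vector<float> positionX, positionY, positionZ;
    std::vector<float> velocityX, velocityY, velocityZ;
    std::vector<std::string> names;  // cold data kept apart
};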

Of course DOD doesn't exist in the void. It's related to many other concepts and you can find many good sources of knowledge about each one. Some of them are:

Comments | #software engineering #c++

